PhpRiot

Creating Search Engine Friendly URLs In PHP

Using A Custom 404 Handler

This is probably the most complicated of the methods covered here, but it is also the most powerful and the easiest to extend.

By taking advantage of Apache’s custom 404 handler, you can have a single controlling script that decides how requests are handled. This applies only to requests that do not match an existing file. For example, if you have images on your web site, you can still access them as usual: the image file exists, so the 404 handler is not used.

Additionally, by using PHP’s header() function, you can send a 200 OK status rather than 404 Not Found, so the end user has no idea that the requested file does not actually exist on the server.

An example of where this would be used

Taking this idea further, you probably wouldn’t bother implementing a system like this just for a news engine as in our examples. It pays off on a larger site that has a lot more content.

For example, consider the following URL: http://www.phpriot.com/d/articles/php/index.html. PhpRiot actually uses the ‘ForceType’ method, with a PHP script called ‘d’ that handles requests, but we could have implemented it using the 404 handler method instead.

Note: This URL was used in a previous version of PhpRiot - the URL scheme of the site no longer works in this fashion.

Suppose we wanted our URL to look like this: http://www.phpriot.com/articles/php/index.html. Instead of creating this path on our web server for each and every article, we would use the 404 handler to parse out the article path like we currently do with our d file.
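
As a rough sketch of what "parsing out the article path" could involve, the function below splits a path such as /articles/php/index.html into a section and a page name. The function name parseArticlePath() and the allowed character set are assumptions made for illustration; this is not PhpRiot's actual code.

```php
<?php
// Hypothetical helper, for illustration only: splits a request path of
// the form /articles/<section>/<page>.html into its two parts.
function parseArticlePath($requestPath)
{
    // assumed naming rule: letters, digits, underscores and hyphens only
    $pattern = '!^/articles/([a-z0-9_-]+)/([a-z0-9_-]+)\.html$!i';

    if (!preg_match($pattern, $requestPath, $matches)) {
        // not an article URL, so the caller should treat it as not found
        return null;
    }

    return array('section' => $matches[1], 'page' => $matches[2]);
}
```

Calling parseArticlePath('/articles/php/index.html') yields the section 'php' and the page 'index', while any path that does not fit the pattern yields null and can be answered with a normal 404.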

Implementing the 404 handler

We’re not going to implement the example listed above, as it involves other complexities not relevant to this article. Instead, we’ll implement our news article example. We’ll also leave room for handling other types of requests (other than news) and for outputting error pages.

The first thing to do is set up the 404 handler. This can be done either in a .htaccess file or in httpd.conf.

Listing 8 .htaccess
ErrorDocument 404 /handler.php

This means that all requests that don’t match an existing file are passed to the handler.php script in our web root.
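
If you would rather configure this in httpd.conf, the same directive can be placed in the server or virtual host configuration. The fragment below is only a sketch: the ServerName and DocumentRoot values are placeholders, not taken from any real setup.

```apache
<VirtualHost *:80>
    # placeholder values, adjust to your own site
    ServerName www.example.com
    DocumentRoot /var/www/example

    # send every request that matches no file to our handler script
    ErrorDocument 404 /handler.php
</VirtualHost>
```

Note that when the directive is placed in a .htaccess file instead, Apache must allow it there via AllowOverride FileInfo (or All) for that directory.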

In this script, we need to parse out the request. Apache makes the originally requested path available in $_SERVER['REDIRECT_URL'].
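
REDIRECT_URL is only set when Apache has redirected internally to the handler, so a script that is also called directly may want a fallback. The sketch below is one possible approach; the function name getOriginalRequestPath() and the fallback to REQUEST_URI are assumptions, not something the listing that follows relies on.

```php
<?php
// Hypothetical helper: returns the path that was originally requested.
// Pass in $_SERVER (taking it as a parameter keeps the function testable).
function getOriginalRequestPath(array $server)
{
    // set by Apache when the ErrorDocument handler is invoked
    if (!empty($server['REDIRECT_URL'])) {
        return $server['REDIRECT_URL'];
    }

    // assumed fallback: REQUEST_URI includes the query string,
    // so keep only the path component
    if (!empty($server['REQUEST_URI'])) {
        $path = parse_url($server['REQUEST_URI'], PHP_URL_PATH);
        return is_string($path) ? $path : '/';
    }

    return '/';
}
```

In a handler script this would be called as getOriginalRequestPath($_SERVER).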

Listing 9 handler.php
<?php
    // Apache sets REDIRECT_URL to the originally requested path when it
    // invokes the ErrorDocument handler
    $request = isset($_SERVER['REDIRECT_URL']) ? $_SERVER['REDIRECT_URL'] : '';

    // explode on / to find all the different request parts
    $parts = explode('/', $request);

    // flag to determine whether or not we've found content
    $found = false;
    $output = null;

    // the first element will be empty so we get rid of it
    array_shift($parts);

    // now we determine the type of content
    switch (isset($parts[0]) ? $parts[0] : '') {
        case 'news':
            // use a very similar regex to our previous example, and only
            // continue if the second part looks like 1234.html
            if (isset($parts[1]) && preg_match('!^(\d+)\.html$!', $parts[1], $matches)) {
                $news_id = (int) $matches[1];

                // this function doesn't really exist, but if it
                // did it would return the news content if article
                // found, or return null if not
                $output = getNewsArticle($news_id);

                if ($output !== null)
                    $found = true;
            }

            break;

        case 'articles':
            // here we would implement a handler to display an article,
            // say if they accessed http://www.example.com/articles/1234.html

            break;

        default:
            // unknown content type: fall through to the 404 response below
    }

    if ($found) {
        // output a header to say the content exists, otherwise a 404 will be sent
        header('HTTP/1.1 200 OK');
        echo $output;
    }
    else {
        // no content was found. this should be automatically sent by the
        // server anyway, but we'll specify it just in case
        header('HTTP/1.0 404 Not Found');
        echo 'File not found';
    }
?>

Obviously this script is slightly crude, but hopefully in its simplicity you can see how powerful this method can be and what possibilities it can open.
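
To try the handler without a database, you could drop in a stand-in for getNewsArticle(). The version below is purely hypothetical: the function body and the sample articles are invented for illustration, and a real implementation would presumably load the content from a database.

```php
<?php
// Hypothetical stand-in for getNewsArticle(): returns the article
// content for a known ID, or null if there is no such article.
function getNewsArticle($news_id)
{
    // sample data invented for this sketch
    $articles = array(
        1 => '<h1>First news item</h1>',
        2 => '<h1>Second news item</h1>',
    );

    return isset($articles[$news_id]) ? $articles[$news_id] : null;
}
```

With this in place, a request for /news/1.html would be answered with the first sample article, and /news/99.html would get the 404 response.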

Article History

Jan 10, 2006
Initial article version
Feb 27, 2008
Added the "Using mod_rewrite as a 404 Handler" page to the article