You are here

Add a page_not_found() hook to Drupal

I have only just started this but I wanted to post for opinion. In IRC I was pushed towards using hook_page_alter() instead of defining a new hook. However hook_page_alter() runs after the 404 header has been sent. I don't expect the code as is to get in. Largely I am posting this as a start because I work on large sites (Economist.com) and I don't want to have to use nfs to share the files directory as it adds another single point of failure in the stack.

Background for wanting hook_404()

The approach that Vignette and other large cm systems take is as follows.

Server X and Y are physical or logical Drupal servers that share the same codebase and database.

1) User A uploads an image ( sites/default/dog.png ) to server X to be included onto node/1234.
2) User B loads node/1234 on server Y. When the request comes into load sites/default/dog.png it fires a Drupal 404 page. If hook_404() exists a module of some sort could look up whether the file exists in the files table and go collect the image from the correct server, write the image locally and either serve the image or redirect to it. I am intentionally vague on how the module would collect the image from Server Y or Z or Q as this is an example use case.
2) User C loads node/1234 on server Y. When the request comes into load sites/default/dog.png the file exists and is served by the webserver. Apache never gets involved.

This approach doesn't need each upload to jump through hoops to put each file uploaded on each server which is fast for the person uploading. It doesn't require a cron job to push the relevant files around which would have time lag issues anyway.

ImageCache registers the callback for imagecache/% if I remember correctly but I think something using a hook_404() could capture any missing URL. In fact imagecache in D7 could well use this approach. I imagine I could have a module with the callback for sites/% to work in the same way as ImageCache but it seems less glamorous.

Comments

I need the same feature but for another reason. In this case, we have two sites, example.com and subsite.example.com. If users type example.com/some-path, and it 404's, we want to check to see if it exists on subsite.example.com, and redirect there. Since Drupal sets the header before calling the path in the site404 variable, it's impossible to redirect as a header has all ready been set.

Two thoughts at the moment:

I think the hook should be called hook_page_not_found(), instead of hook_404()

Do we want modules to be able to override the return value of menu_execute_active_handler()? In my specific case, it wouldn't matter since the module would end in a drupal_goto() call, but I could imagine other cases where a module may not want MENU_NOT_FOUND to be returned

Secondly, as far as I know none of the existing hook_*() functions are called directly. I believe you need to perform the actions in your hook_404() function directly and you can use module_invoke_all.

So it becomes something like:

<?phpfunction hook_page_not_found($path = NULL) {// We can leave this empty. hook_access() has a basic implementation but it's not // called directly so I don't see the use.}

I've changed the code to support the two comments. Additionally in the patch there is a space removed from the end of the line in theme_menu_tree() as my eclipse is set up to remove end of line spaces. I could pull that out if needed but I don't see the point yet.

"- * - leaf: The menu item has no submenu. ""+ * - leaf: The menu item has no submenu."

Also I've created a very small sample module that would use the hook and can be edited to return different page states, like access denied or page not found e.t.c.

I'm able to write my module for D7 using the built-in API's now that we have drupal_static(). The only catch is that you have to remember to unset($_REQUEST['destination']) before calling drupal_goto(), otherwise you get redirected to the previous destination. The following seems to work fine when set as the callback for the site404 variable:

The general scheme in comment #11 works except that I had to use $_GET['destination'] instead of $_REQUEST['destination'].

EDIT: Also, doing the redirect as part of the default 404 page results in Drupal logging the original access as a page-not-found error even when we never return a 404 to the browser/client. Seems like a real hook would be better, as in the originally proposed patch.