Alternative to include/require for autoloaders

This RFC aims to offer an alternative solution to the well known fopen() “hack” used in autoloaders in order to verify the existence of files inside the include path.

Introduction

Currently many autoloaders require the use of inefficient and in extreme cases even error prone “hacks” to be able to handle syntax errors differently than missing files when using include/require.

Current situation

Many autoloaders (phd, ZF, PEAR2) use the following “hack” to determine of a file exists before loading it:

Just “blindly” including the file makes it impossible to determine if there was a syntax error or if the file was missing. In some cases when integrating files from different projects there might be different naming conventions which makes it necessary to attempt different possible filenames without suppressing potential parse errors.

Furthermore the above code first opens the file, closes the file and then finally includes the file. This means there are multiple function calls, multiple filesystem calls and even an very low risk for a race condition.

An alternative approach is to read the include path setting and iterate over all the directories using file_exists(). This is overly expensive, especially with larger include paths. Furthermore it also suffers from the potential race condition, even if the “use_include_path” flag from fopen() would be added to file_exists() - though adding this flag is probably not desirably for the reason of adding another place where the ini settings are read.

Proposal

In order to solve the above issues this RFC proposes the addition of a new construct/function for now called “autoload_include” for lack of a better name that largely behaves like the “include” does today with the following differences, that when the include failed because of a missing file no warning is raised and php null is returned.

The current working title “autoload_include” is probably not ideal and should probably be replaced with a more meaningful name.

Optional related aspects

A potential interesting additional difference could be to return the name of the loaded file in case the file was loaded successfully and the file does not return any value explicitly. This could make it possible to assist in caching the lookup. Potentially the “spl” prefix should be used as well.

Alternative name proposals:

include_silent

contain

superset

import

load

Alternative proposals

Add stream support to include/require

However Stas notes that this would “this would break security distinction between file ops and include ops, when URLs are allowed for open but not include”, but Greg notes this should be solvable since “the wrapper used to open the file pointer is stored in the resource, so we can just check it against the same restrictions we would for static urls”. That being said Stas additionally notes that this could pose challenges for byte code caches since they would have to “watch all file opens, in case some of these will later be used for include” as well as “somehow be able to get filename back from open stream to get the cached file”.

Add function to resolve the include path

Either add a “use_include_path” flag to file_exists() or add a new file_find() function that supports searching the include path and returns the absolute path to the first match or false if none could be found. The later would be superior since it would also enable the optional feature mentioned above. It would still however require file system access twice, though it would only have to resolve the include path once. It also does not solve the edge case of a race condition, which however seems very rare for files that are included via autoload. There is actually already an implementation which was put into HEAD aka PHP6 as the result of a previous discussion.

That being said the current implementation needs some tweaks as Greg points out:
“stream_resolve_include_path() as currently constructed could not be intercepted, and is actually unable to process an include_path that contains streams. I'm guessing it was written long before PHP 5.3. This could be easily fixed by having stream_resolve_include_path call zend_resolve_path() instead of doing its own internal calculations. With these changes, an opcode cache could easily cache the results.”