The layout of the Manual

We'll start with some brief notes about what File::Util is (and isn't), then talk about the syntax used in File::Util. After that, we discuss custom error handling and diagnostics in File::Util. Finally, the rest of this document covers File::Util's object methods, one by one, with brief usage examples.

What File::Util Is

File::Util is a "Pure Perl" library that provides you with several easy-to-use tools to wrangle files and directories. It has higher order methods (that's fancy talk for saying that you can feed subroutine references to some of File::Util's object methods and they will be treated like "callbacks").

File::Util is mainly Object-Oriented Perl, but strives to be gentle and accommodating to those who do not know about or who do not like "OO" interfaces. As such, many of the object methods available in File::Util can also be imported into your namespace and used like regular subroutines to make short work of simple tasks.

For more advanced tasks and features, you will need to use File::Util's object-oriented interface. Don't worry, it's easy, and there are plenty of examples here in the documentation to get you off to a great and productive start. If you run into trouble, help is available.

File::Util tries its best to adhere to these guiding principles:

Be easy

Make hard things easier and safer to do while avoiding common mistakes associated with file handling in Perl. Code using File::Util will automatically be abiding by best practices with regard to Perl IO.

File::Util makes the right decisions for you with regard to all the little details involved in the vast majority of file-related tasks. File locking is automatically performed for you! File handles are always lexically scoped. Safe reads and writes are performed with hard limits on the amount of RAM you are allowed to consume in your process per file read. (You can adjust the limits.)

Be portable

We make sure that File::Util is going to work on your computer or virtual machine. If you run Windows, Mac, Linux, BSD, or some other flavor of Unix, File::Util should work right out of the box. There are currently no platforms where Perl runs that we do not support. If Perl can run on it, File::Util can run on it. If you want Unicode support, however, you need to be running Perl 5.8 or better.

Be compatible

File::Util has been around for a long time, and so has Perl. We'd like to think that this is because they are good things! This means there is a lot of backward-compatibility to account for, even within File::Util itself.

In the last several years, there has never been a release of File::Util that intentionally broke code running a previous version. We are unaware of that ever happening. File::Util is written to support both old and new features, syntaxes, and interfaces with full backward-compatibility.

Be helpful

If requested, File::Util outputs extremely detailed error messages when something goes wrong in a File::Util operation. The diagnostic error messages not only provide information about what went wrong, but also hints on how to fix the problem.

These error messages can easily be turned on and off. See DIAGNOSTICS for the details.

Be Pure

File::Util uses no XS or C underpinnings that require you to have a compiler or make utility on your system in order to use it. Simply follow standard installation procedures (INSTALLATION) and you're done. No compiling required.

What File::Util Is NOT

File::Util offers significant performance increases over other modules for most directory-walking and searching, whether doing so in a single directory or in many directories recursively. (See also the benchmarking and profiling scripts included in the performance subdirectory as part of this distribution.)*

However, File::Util is NOT a single-purpose file-finding/searching utility like File::Find::Rule, which offers a handful of extra built-in search features that File::Util does not give you out of the box, such as searching for files by owner/group or size. It is possible to accomplish the same things by taking advantage of File::Util's callbacks if you want to, but this isn't the "one thing" File::Util was built to do.

*Sometimes it doesn't matter how fast you can search through a directory 1000 times. Performance alone isn't the best criterion for choosing a module.

SYNTAX

In the past, File::Util relied on an older method invocation syntax that was not robust enough to support the newer features that have been added since version 4.0. In addition to making new features possible, the new syntax is more in keeping with what the Perl community has come to expect from its favorite modules, like Moose and DBIx::Class.

If you already have code that uses the old syntax, DON'T WORRY -- it's still fully supported behind the scenes. However, for new code that takes advantage of new features like higher order functions (callbacks), or advanced matching for directory listings, you'll need to use the syntax as set forth in this document. The old syntax isn't covered here, because you shouldn't use it anymore.

An Explanation Of The "Options Hashref"

As shown in the code example above, the new syntax uses hash references to specify options for calls to File::Util methods. This documentation refers to these as the "options hashref". The code examples below illustrate what they are and how they are used. Advanced Perl programmers will recognize these right away.

NOTE: "hashref" is short for "hash reference". Hash references use curly brackets and look like this:
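A minimal sketch of a hash reference in plain Perl. The first hash uses made-up keys and values purely to show the syntax; the second reuses two option names that appear later in this document (recurse, files_only) only as placeholders.

```perl
use strict;
use warnings;

# A hash reference is written with curly brackets (hypothetical keys/values)
my $hashref = { name => 'Larry', language => 'Perl' };

# Values are reached with the arrow operator
print "$hashref->{name} writes $hashref->{language}\n";

# An "options hashref" is the same kind of value, passed along with a call.
# The option names here are placeholders for illustration.
my $options = { recurse => 1, files_only => 1 };
print join( ', ', sort keys %$options ), "\n";
```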

ERROR HANDLING

Feature Summary

Managing potential errors is a big part of Perl IO. File::Util gives you several options. In fact, every single call to a File::Util method which accepts an "options hashref" can also include an error handling directive. File::Util has some pre-defined error handling behaviors that you can choose from, or you can supply your own error handler routine. This is accomplished via the onfail option.

As an added convenience, when you use this option with the File::Util constructor method, it sets the default error handling policy for all failures; in other words, you can set up one error handler for everything and never have to worry about it after that.

Details

keyword: die

This is what File::Util already does: it calls CORE::die() with an error message when it encounters a fatal error, and your program terminates.

Example:

my $ftl = File::Util->new( ... { onfail => 'die' } );

keyword: zero

When you use the predefined zero behavior as the onfail handler, File::Util will return a zero value (the integer 0) if it encounters a fatal error, instead of dying. File::Util won't warn about the error or abort execution. You will just get a zero back instead of what you would have gotten otherwise, and execution will continue as if no error happened.

Example:

my $content = File::Util->load_file( ... { onfail => 'zero' } );

keyword: undefined

When you use the predefined undefined behavior as the onfail handler, if File::Util runs into a fatal error it will return undef. Execution will not be aborted, and no warnings will be issued. A value of undef will just get sent back to the caller instead of what you would have gotten otherwise. Execution will then continue on as if no error happened.

Note: This option usually makes more practical sense than onfail => 'zero'

keyword: warn

When you use the predefined warn behavior as the onfail handler, File::Util will return undef if it encounters a fatal error, instead of dying. Then File::Util will emit a warning with details about the error, but will not abort execution. You will just get a warning message sent to STDERR and undef gets sent back to the caller instead of what you would have gotten otherwise. Other than the warning, execution will continue as if no error ever happened.

Example:

my $write_ok = File::Util->write_file( ... { onfail => 'warn' } );

keyword: message

When you use the predefined message behavior as the onfail handler, if File::Util runs into a fatal error it will return an error message in the form of a string containing details about the problem. Execution will not be aborted, and no warnings will be issued. You will just get an error message sent back to the caller instead of what you would have gotten otherwise. Execution will then continue on as if no error happened.

Example:

my @files = File::Util->list_dir( ... { onfail => 'message' } );

subroutine reference

If you supply a code reference to the onfail option in a File::Util method call, it will execute that code if it encounters a fatal error. You must supply a true code reference, as shown in the examples below, either to a named or anonymous subroutine.

The subroutine you specify will receive two arguments as its input in "@_". The first will be the text of the error message, and the second will be a stack trace in text format. You can send them to a logger, to your sysadmin in an email alert, or whatever you like, because it is *your* error handler.

WARNING! If you do not call die or exit at the end of your error handler, File::Util will NOT exit, but continue to execute. When you opt to use this feature, you are fully responsible for your process's error handling and post-error execution.
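As a rough sketch of such a handler, the code below defines an anonymous subroutine that takes the two arguments described above and ends by calling die. The handler body and the sample arguments are hypothetical; to keep the sketch self-contained, the handler is invoked by hand rather than by File::Util.

```perl
use strict;
use warnings;

# A hypothetical handler: the error message arrives first,
# the text stack trace second.
my $handler = sub {
   my ( $err, $stack ) = @_;

   # Record both pieces (here: simply on STDERR)...
   warn "File operation failed: $err\n";
   warn "Stack trace:\n$stack\n";

   # ...then end the operation ourselves, as the warning above advises
   die "aborting after file error\n";
};

# Simulate a failure by calling the handler with made-up arguments.
# In real code it would be supplied as { onfail => $handler } instead.
my $survived = eval { $handler->( 'sample error text', 'sample stack trace' ); 1 };

print "handler ended the operation: ", ( $survived ? 'no' : 'yes' ), "\n";
```

Because the handler calls die, the simulated operation does not continue past the failure; a handler that simply returned would let execution carry on.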

DIAGNOSTICS

When things go wrong, sometimes it's nice to get as much information as possible about the error. In File::Util, you incur no performance penalties by enabling more verbose error messages. In fact, you're encouraged to do so.

You can globally enable diagnostic messages (for every File::Util object you create), or on a per-object basis, or even on a per-call basis when you just want to diagnose a problem with a single method invocation. Here's how:
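The three levels might look roughly like the sketch below. It assumes the ':diag' import tag and a 'diag' option key, neither of which is spelled out in the text above, so check them against your installed version of File::Util. The temporary file is only there to give load_file() something to read.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Globally, for every File::Util object created in this program
# (assumption: the ':diag' import tag)
use File::Util qw( :diag );

# Per object, through the constructor's options hashref
# (assumption: the option is named 'diag')
my $ftl = File::Util->new( { diag => 1 } );

# ...or by toggling the object's diagnostic() method at any time
$ftl->diagnostic( 0 );   # off
$ftl->diagnostic( 1 );   # on again

# A throwaway file to read
my ( $fh, $file ) = tempfile( UNLINK => 1 );
print {$fh} "hello\n";
close $fh;

# Per call: only this one invocation asks for diagnostics
my $content = $ftl->load_file( $file => { diag => 1 } );

print length( $content ), " bytes read\n";
```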

METHODS

Note: In the past, some of the methods listed would state that they were autoloaded methods. This mechanism has been changed in favor of more modern practices, in step with the evolution of computing over the last decade since File::Util was first released.

Methods listed in alphabetical order.

atomize_path

Syntax: atomize_path( [/file/path or file_name] )

This method is used internally by File::Util to handle absolute filenames on different platforms in a portable manner, but it can be a useful tool for you as well.

This method takes a single string as its argument. The string is expected to be a fully-qualified (absolute) or relative path to a file or directory. It carefully splits the string into three parts: The root of the path, the rest of the path, and the final file/directory named in the string.

Depending on the input, the root and/or path may be empty strings. The following table can serve as a guide in what to expect from atomize_path()

bitmask

Syntax: bitmask( [file name] )

Gets the bitmask of the named file, provided the file exists. If the file exists and is accessible, the bitmask of the named file is returned in four digit octal notation e.g.- 0644. Otherwise, returns undef if the file does not exist or could not be accessed.
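For readers curious what a four-digit octal bitmask corresponds to, here is a sketch in core Perl that derives one from stat(). The helper name octal_mode is hypothetical and this is not File::Util's own code, only an illustration of the notation.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper: permission bits of a file as a four-digit octal
# string, or undef when the file cannot be stat()ed.
sub octal_mode {
   my ( $file ) = @_;
   my @stat = stat $file or return;
   return sprintf '%04o', $stat[2] & 07777;
}

my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;
chmod 0644, $file;

my $mode    = octal_mode( $file );
my $missing = octal_mode( "$file.missing" );

print "mode of temp file: $mode\n";
print "mode of missing file: ", ( defined $missing ? $missing : 'undef' ), "\n";
```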

can_flock

Syntax: can_flock

Returns 1 if the current system claims to support flock() and if the Perl process can successfully call it. (see "flock" in perlfunc.) Unless both of these conditions are true, a zero value (0) is returned. This is a constant method. It accepts no arguments and will always return the same value for the system on which it is executed.

Note: Perl tries to support or emulate flock whenever it can via available system calls, namely flock; lockf; or with fcntl.

created

Syntax: created( [file name] )

Returns the time of creation for the named file in non-leap seconds since whatever your system considers to be the epoch. Suitable for feeding to Perl's built-in functions "gmtime" and "localtime". (see "time" in perlfunc.)

diagnostic

Syntax: diagnostic( [true / false value] )

When called without any arguments, this method returns a true or false value to reflect the current setting for the use of diagnostic (verbose) error messages when a File::Util object encounters errors.

When called with a true or false value as its single argument, this tells the File::Util object whether or not it should enable diagnostic error messages in the event of a failure. A true value indicates that the File::Util object will enable diagnostic mode, and a false value indicates that it will not. The default setting for diagnostic() is 0 (NOT enabled.)

default_path

Syntax: default_path( [string, string] )

The second string argument is optional.

Works just like strict_path, except that instead of returning undef when the argument passed in doesn't look like a path, it will return a default string instead. The default string returned will either be the built-in default path, or the string you specify as a second argument to this method.

ebcdic

Syntax: ebcdic

Returns 1 if the machine on which the code is running uses EBCDIC, or returns 0 if not. (see perlebcdic.) This is a constant method. It accepts no arguments and will always return the same value for the system on which it is executed.

escape_filename

Syntax: escape_filename( [string], [escape char] )

Returns its argument in an escaped form that is suitable for use as a filename. Illegal characters (i.e.- any type of newline character, tab, vtab, and the following / | * " ? < : > \), are replaced with [escape char] or "_" if no [escape char] is specified. Returns an empty string if no arguments are provided.

existent

Syntax: existent( [file name] )

Returns 1 if the named file (or directory) exists. Otherwise a value of undef is returned.

This works the same as Perl's built-in -e file test operator, (see "-X" in perlfunc), it's just easier for some people to remember.

file_type

Syntax: file_type( [file name] )

Returns a list of keywords corresponding to each of Perl's built in file tests (those specific to file types) for which the named file returns true. (see "-X" in perlfunc.)

The keywords and their definitions appear below; the order of keywords returned is the same as the order in which they are listed here:

PLAIN File is a plain file.

TEXT File is a text file.

BINARY File is a binary file.

DIRECTORY File is a directory.

SYMLINK File is a symbolic link.

PIPE File is a named pipe (FIFO).

SOCKET File is a socket.

BLOCK File is a block special file.

CHARACTER File is a character special file.

flock_rules

Syntax: flock_rules( [keyword list] )

Sets I/O race condition policy, or tells File::Util how it should handle race conditions created when a file can't be locked because it is already locked somewhere else (usually by another process).

An empty call to this method returns a list of keywords representing the rules that are currently in effect for the object.

Otherwise, a call should include a list containing your chosen directive keywords in order of precedence. The rules will be applied in cascading order when a File::Util object attempts to lock a file, so if the actions specified by the first rule don't result in success, the second rule is applied, and so on.

This setting can be dynamically changed at any point in your code by calling this method as desired.

The default behavior of File::Util is to try and obtain an exclusive lock on all file opens (if supported by your operating system). If a lock cannot be obtained, File::Util will throw an exception and exit.

If you want to change that behavior, this method is the way to do it. One common situation is for someone to want their code to first try for a lock, and failing that, to wait until one can be obtained. If that's what you want, see the examples after the keywords list below.

Recognized keywords:

NOBLOCKEX

tries to get an exclusive lock on the file without blocking (waiting)

NOBLOCKSH

tries to get a shared lock on the file without blocking

BLOCKEX

waits to get an exclusive lock

BLOCKSH

waits to get a shared lock

FAIL

dies with stack trace

WARN

warn()s about the error and returns undef

IGNORE

ignores the failure to get an exclusive lock

UNDEF

returns undef

ZERO

returns 0

Examples:

ex- flock_rules( qw( NOBLOCKEX FAIL ) );

This is the default policy. When in effect, the File::Util object will first attempt to get a non-blocking exclusive lock on the file. If that attempt fails the File::Util object will call die() with an error.

ex- flock_rules( qw( NOBLOCKEX BLOCKEX FAIL ) );

The File::Util object will first attempt to get a non-blocking exclusive lock on the file. If that attempt fails it falls back to the second policy rule "BLOCKEX" and tries again to get an exclusive lock on the file, but this time by blocking (waiting for its turn). If that second attempt fails, the File::Util object will fail with an error.

ex- flock_rules( qw( BLOCKEX IGNORE ) );

The File::Util object will first attempt to get an exclusive lock on the file, blocking (waiting) until one can be obtained. If that attempt fails it will ignore the error, and go on to open the file anyway and no failures or warnings will occur.
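To make the cascading idea concrete, here is a sketch in core Perl using flock() from Fcntl. The helper lock_by_rules is hypothetical and is not how File::Util implements its policy; it only mimics a few of the keywords above (the four lock types, FAIL, and IGNORE).

```perl
use strict;
use warnings;
use Fcntl qw( :flock );
use File::Temp qw( tempfile );

# Hypothetical helper illustrating the cascade. Tries each rule in order
# and returns the name of the rule that ended up being applied.
sub lock_by_rules {
   my ( $fh, @rules ) = @_;

   my %lock_mode = (
      NOBLOCKEX => LOCK_EX | LOCK_NB,
      NOBLOCKSH => LOCK_SH | LOCK_NB,
      BLOCKEX   => LOCK_EX,
      BLOCKSH   => LOCK_SH,
   );

   for my $rule ( @rules ) {
      if ( exists $lock_mode{ $rule } ) {
         return $rule if flock $fh, $lock_mode{ $rule };
         next;    # lock attempt failed: move on to the next rule
      }
      die "could not lock file\n" if $rule eq 'FAIL';
      return $rule if $rule eq 'IGNORE';    # carry on without a lock
   }

   return;
}

my ( $fh, $file ) = tempfile( UNLINK => 1 );

# Nobody else holds a lock on this fresh temp file, so the first rule wins
my $applied = lock_by_rules( $fh, qw( NOBLOCKEX BLOCKEX FAIL ) );
print "lock obtained via rule: $applied\n";

close $fh;
```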

is_bin

Syntax: is_bin( [file name] )

Returns 1 if the named file is a binary file. Otherwise a value of undef is returned, indicating that the named file either does not exist or is of another file type.

This works the same as Perl's built-in -B file test operator, (see "-X" in perlfunc), it's just easier for some people to remember.

is_readable

Syntax: is_readable( [file name] )

Returns 1 if the named file (or directory) is readable by your program according to the applied permissions of the file system on which the file resides. Otherwise a value of undef is returned.

This works the same as Perl's built-in -r file test operator, (see "-X" in perlfunc), it's just easier for some people to remember.

is_writable

Syntax: is_writable( [file name] )

Returns 1 if the named file (or directory) is writable by your program according to the applied permissions of the file system on which the file resides. Otherwise a value of undef is returned.

This works the same as Perl's built-in -w file test operator, (see "-X" in perlfunc), it's just easier for some people to remember.

last_access

Syntax: last_access( [file name] )

Returns the last accessed time for the named file in non-leap seconds since whatever your system considers to be the epoch. Suitable for feeding to Perl's built-in functions "gmtime" and "localtime". (see "time" in perlfunc.)

last_changed

Syntax: last_changed( [file name] )

Returns the inode change time for the named file in non-leap seconds since whatever your system considers to be the epoch. Suitable for feeding to Perl's built-in functions "gmtime" and "localtime". (see "time" in perlfunc.)

last_modified

Syntax: last_modified( [file name] )

Returns the last modified time for the named file in non-leap seconds since whatever your system considers to be the epoch. Suitable for feeding to Perl's built-in functions "gmtime" and "localtime". (see "time" in perlfunc.)
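The time-related methods (created, last_access, last_changed, last_modified) all return epoch seconds. The sketch below shows how such a value can be turned into readable text with localtime() and gmtime(). To stay self-contained it takes the epoch value from core Perl's stat() as a stand-in for a File::Util call.

```perl
use strict;
use warnings;
use POSIX qw( strftime );
use File::Temp qw( tempfile );

my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;

# Stand-in for a value like the one last_modified() returns:
# epoch seconds taken from stat() (index 9 is the modification time)
my $epoch = ( stat $file )[9];

# Epoch seconds can be fed directly to localtime() or gmtime()
my $local = strftime '%Y-%m-%d %H:%M:%S', localtime $epoch;
my $utc   = strftime '%Y-%m-%d %H:%M:%S', gmtime $epoch;

print "modified (local): $local\n";
print "modified (UTC):   $utc\n";
```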

line_count

Syntax: line_count( [file name] )

Returns the number of lines in the named file. Fails with an error if the named file does not exist.

list_dir

Syntax: list_dir( [directory name] => { option => value, ... } )

Returns all file names in the specified directory, sorted in alphabetical order. Fails with an error if no such directory is found, or if the directory is inaccessible.

Note that this is one of File::Util's most robust methods, and can be very useful. It can be used as a higher order function (accepting callback subrefs), and can be used for advanced pattern matching against files. It can also return a hierarchical data structure of the file tree you ask it to walk.

callback => subroutine reference

list_dir() can accept references to subroutines of your own. If you pass it a code reference using this option, File::Util will execute your code every time list_dir() enters a directory. This is particularly useful when combined with the recurse option which is explained below.

When you create a callback function, File::Util will pass it four arguments in this order: the name of the current directory, a reference to a list of subdirectories in the current directory, a reference to a list of files in the current directory, and the depth (positive integer) relative to the directory you provided as your first argument to list_dir(). This means if you pass in a path such as /var/tmp, that "/var/tmp" is at a depth of 0, "/var/tmp/foo" is 1 deep, and so on down through the "/var/tmp" directory.

Remember that the code in your callback gets executed in real time, as list_dir() is walking the directory tree. Consider this example:
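A sketch of a callback in use, assuming the option is named callback (as the d_callback and f_callback descriptions suggest) and that it is combined with recurse. The temporary directory tree and the file names in it are made up for the illustration.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

# Build a small throwaway tree: <tmp>/a.txt and <tmp>/sub/b.txt
my $top = tempdir( CLEANUP => 1 );
mkdir "$top/sub" or die $!;
for my $file ( "$top/a.txt", "$top/sub/b.txt" ) {
   open my $fh, '>', $file or die $!;
   print {$fh} "sample\n";
   close $fh;
}

my $ftl = File::Util->new;
my @visited;

# The callback is run for each directory entered, while the walk is
# still in progress
$ftl->list_dir(
   $top => {
      recurse  => 1,
      callback => sub {
         my ( $dir, $subdirs, $files, $depth ) = @_;
         push @visited, $dir;
         printf "depth %d: %s (%d subdirs, %d files)\n",
            $depth, $dir, scalar @$subdirs, scalar @$files;
      },
   }
);
```

Since the callback fires during the walk, anything it prints appears before list_dir() returns.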

d_callback => subroutine reference

A d_callback is just like a callback, except it is only executed on directories encountered in the file tree, not files, and its input is slightly different. @_ is comprised of (in order) the name of the current directory, a reference to a list of all subdirectories in that directory, and the depth (positive integer) relative to the top level directory in the path you provided as your first argument to list_dir.

f_callback => subroutine reference

Similarly, an f_callback is just like a callback, except it is only concerned with files encountered in the file tree, not directories. Its input is also slightly different. @_ is comprised of (in order) the name of the current directory, a reference to a list of all files present in that directory, and the depth (positive integer) relative to the top level directory in the path you provided as your first argument to list_dir.

dirs_only => boolean

return only directory contents which are also directories

files_only => boolean

return only directory contents which are files

max_depth => positive integer

Works just like the -maxdepth flag in the GNU find command. This option tells list_dir() to limit results to directories at no more than the maximum depth you specify. This only works in tandem with the recurse option (or the recurse_fast option which is similar).

For compatibility reasons, you can use "maxdepth" without the underscore instead, and get the same functionality.

no_fsdots => boolean

do not include "." and ".." in the list of directory contents returned

abort_depth => positive integer

Override the global limit on abort_depth recursions for directory listings, on a per-listing basis with this option. Just like the main abort_depth() object method, this option takes a positive integer. The default is 1000. Sometimes it is useful to increase this number by quite a lot when walking directories with callbacks.

with_paths => boolean

Return results with the preceding file paths intact, relative to the directory named in the call.

recurse => boolean

Recurse into subdirectories. In other words, open up subdirectories and continue to descend into the directory tree either as far as it goes, or until the abort_depth limit is reached. See abort_depth()

recurse_fast => boolean

Recurse into subdirectories, without checking for filesystem loops. This works exactly like the recurse option, except it turns off internal checking for duplicate inodes while descending through a file tree.

You get a performance boost at the sacrifice of a little "safety checking".

The bigger your file tree, the more performance gains you see.

This option has no effect on Windows. (see perldoc -f stat)

dirs_as_ref => boolean

When returning directory listing, include first a reference to the list of subdirectories found, followed by anything else returned by the call.

files_as_ref => boolean

When returning directory listing, include last a reference to the list of files found, preceded by a list of subdirectories found (or preceded by a list reference to subdirectories found if dirs_as_ref was also used).

as_ref => boolean

Return a pair of list references: the first is a reference to any subdirectories found by the call, the second is a reference to any files found by the call.

sl_after_dirs => boolean

Append a directory separator ("/", "\", or ":" depending on your system) to all directories found by the call. Useful in visual displays for quick differentiation between subdirectories and files.

ignore_case => boolean

Return items in a case-insensitive alphabetic sort order, as opposed to the default.

**By default, items returned by the call to this method are alphabetically sorted in a case-sensitive manner, such that "Zoo.txt" comes before "alligator.txt". This is also the way files are listed at the system level on most operating systems.

However, if you'd like the directory contents returned by this method to be sorted without regard to case, use this option. That way, "alligator.txt" will come before "Zoo.txt".
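The difference between the two orderings can be seen with Perl's own sort. This sketch uses plain Perl and a made-up list of names; it illustrates the ordering only and makes no claim about how list_dir() sorts internally.

```perl
use strict;
use warnings;

my @names = ( 'alligator.txt', 'Zoo.txt', 'bear.txt' );

# Perl's default string sort is case-sensitive: uppercase letters sort first
my @case_sensitive = sort @names;

# Folding case before comparing gives the order described for ignore_case
my @case_insensitive = sort { lc $a cmp lc $b } @names;

print "default:     @case_sensitive\n";     # Zoo.txt alligator.txt bear.txt
print "ignore case: @case_insensitive\n";   # alligator.txt bear.txt Zoo.txt
```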

count_only => boolean

Returns a single value: an integer reflecting the number of items found in the directory after applying any filter criteria that may also have been specified by other options (i.e.- "dirs_only", "recurse", etc.)

as_tree => boolean

Returns a hierarchical data structure (hashref) of the file tree in the directory you specify as the first argument to list_dir(). Use in combination with other options to get the exact results you want in the data structure.

*Note: When using this option, the "files_only" and "dirs_only" options are ignored, but you can still specify things like a "max_depth" argument. Note also that you need to specifically call this with the "recurse" or "recurse_fast" option or you will only get a single-level tree structure.

When using this option, the hashref you get back will have certain metadata entries at each level of the hierarchy, namely there will be two special keys: "_DIR_SELF_", and "_DIR_PARENT_". Their values will be the name of the directory itself, and the name of its parent, respectively.

That metadata can be extremely helpful when iterating over and parsing the hashref later on, but if you don't want the metadata, include the dirmeta option and set it to a zero (false) value as shown below:
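A sketch of such a call, combining the as_tree, recurse, and dirmeta options named above. The temporary directory is made up for the illustration, and the exact shape of the returned hashref is not asserted here beyond its being a hash reference.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

# Throwaway directory containing one subdirectory
my $top = tempdir( CLEANUP => 1 );
mkdir "$top/sub" or die $!;

my $ftl = File::Util->new;

# Ask for a tree of the whole directory, without the metadata entries
my $tree = $ftl->list_dir(
   $top => { as_tree => 1, recurse => 1, dirmeta => 0 }
);

print "got a ", ref( $tree ), " reference\n";
```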

**Remember: the as_tree option doesn't recurse into subdirectories unless you tell it to with recurse => 1

Filtering and Matching with list_dir()

list_dir() can use Perl Regular Expressions to match against and thereby filter the results it returns. It can match based on file name, directory name, the path preceding results, and the parent directory of results. The matching arguments you use must be real regular expression references as shown (i.e.- NOT strings).

Regular expressions can be provided as a single argument value, or a specifically crafted hashref designating a list of patterns to match against in either an "or" manner, or an "and"ed cumulative manner.

Some short examples of proper syntax will be provided after the list of matching options below.

**If you experience a big slowdown in directory listings while using regular expressions, check to make sure your regular expressions are properly written and optimized. In general, directory listings should not be slow or resource-intensive. Badly-written regular expressions will result in considerable slowdowns and bottlenecks in any application.

files_match => qr/regexp/

OR: files_match => { and/or => [ qr/listref of/, qr/regexps/ ] }

Return only file names matching the regex(es). Preceding directories are included in the results; for technical reasons they are not excluded (if they were excluded, list_dir() would not be able to "cascade" or recurse into subdirectories in search of matching files).

Use the files_only option in combination with this matching parameter to exclude the preceding directory names.

dirs_match => qr/regexp/

OR: dirs_match => { and/or => [ qr/listref of/, qr/regexps/ ] }

Return only files and subdirectory names in directories that match the regex(es) you specify. BE CAREFUL with this one!! It doesn't "cascade" the way you might expect; for technical reasons, it won't descend into directories that don't match the regex(es) you provide. For example, if you want to match a directory name that is three levels deep against a given pattern, but don't know (or don't care about) the names of the intermediate directories-- THIS IS NOT THE OPTION YOU ARE LOOKING FOR. Use the path_matches option instead.

*NOTE: Bear in mind that just because you tell list_dir() to match each directory against the regex(es) you specify here, that doesn't mean you are telling it to only show directories in its results. You will get file names in matching directories included in the results as well, unless you combine this with the dirs_only option.

path_matches => qr/regexp/

OR: path_matches => { and/or => [ qr/listref of/, qr/regexps/ ] }

Return only files and subdirectory names with preceding paths that match the regex(es) you specify.

As shown in the File::Util::Manual::Examples, Perl already provides support for negated matching in the form of "zero-width negative assertions". (See perlre for details on how they work). Use syntax like the regular expressions below to match anything that is NOT part of the subpattern.
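Two small sketches of negated patterns in plain Perl follow. The patterns and sample names are hypothetical; compiled with qr//, expressions of this kind are the sort of value the matching options above expect.

```perl
use strict;
use warnings;

# Match names that do NOT end in ".bak" (zero-width negative look-behind)
my $not_backup = qr/(?<!\.bak)$/;

# Match names that do NOT start with a dot (zero-width negative look-ahead)
my $not_hidden = qr/^(?!\.)/;

for my $name ( qw( notes.txt notes.bak .profile ) ) {
   printf "%-10s not-backup=%-3s not-hidden=%s\n",
      $name,
      ( $name =~ $not_backup ? 'yes' : 'no' ),
      ( $name =~ $not_hidden ? 'yes' : 'no' );
}
```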

load_dir

Syntax: load_dir( [directory name] => { options } )

Returns a data structure containing the contents of each file present in the named directory.

The type of data structure returned is determined by the optional data-type option parameter. Only one option at a time may be used for a given call to this method. Recognized options are listed below.

Implicit. If no option is passed in, the default behavior is to return a reference to an anonymous hash whose keys are the names of each file in the specified directory; the hash values contain the contents of the file represented by the corresponding key.

as_list => boolean

Causes the method to return a list comprised of the contents loaded from each file (in case-sensitive order) located in the named directory.

This is useful in situations where you don't care what the filenames were and you just want a list of file contents.

as_listref => boolean

Same as above, except an array reference to the list of items is returned rather than the list itself. This is more efficient than the above, particularly when dealing with large lists.

load_dir() does not recurse or accept matching parameters, etc. It's an effective tool for loading up things like a directory of template files on a web server, or to store binary data streams in memory. Use it however you like.

However, if you do want to load files into a hashref/listref or array while using the advanced features of list_dir(), just use list_dir to return the files and map the contents into your variable:
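One way this could look is sketched below, using the files_only and with_paths options described under list_dir(). The temporary directory and its two files are made up for the illustration.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

# Throwaway directory with two small files
my $dir = tempdir( CLEANUP => 1 );
for my $name ( qw( a.txt b.html ) ) {
   open my $fh, '>', "$dir/$name" or die $!;
   print {$fh} "contents of $name\n";
   close $fh;
}

my $ftl = File::Util->new;

# list_dir() selects the files; load_file() supplies each one's contents
my $files = {
   map { $_ => $ftl->load_file( $_ ) }
   $ftl->list_dir( $dir => { files_only => 1, with_paths => 1 } )
};

print "$_\n" for sort keys %$files;
```

Any of the filtering options of list_dir() (recursion, regex matching, and so on) could be added to the inner call without changing the map.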

Note: This method does not distinguish between plain files and other file types such as binaries, FIFOs, sockets, etc.

Restrictions imposed by the current "read limit" (see the read_limit() entry below) will be applied to the individual files opened by this method as well. Adjust the read limit as necessary.

Example usage:

my $templates = $f->load_dir( 'templates/stock-ticker' );

The above code creates an anonymous hash reference that is stored in the variable named "$templates". The keys of the hash referenced by "$templates" would be the file names, and the values would be the contents of the corresponding files (given that the files in the named directory were the files 'a.txt', 'b.html', 'c.dat', and 'd.conf').
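The resulting structure can be pictured with a hand-written hashref like the one below. The values are placeholder strings standing in for real file contents.

```perl
use strict;
use warnings;

# Illustrative shape only: the values are placeholders, not real contents
my $templates = {
   'a.txt'  => 'the contents of file a.txt',
   'b.html' => 'the contents of file b.html',
   'c.dat'  => 'the contents of file c.dat',
   'd.conf' => 'the contents of file d.conf',
};

# Look up one file's contents by its name
print $templates->{'b.html'}, "\n";
```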

load_file

Syntax: load_file( [file name] => { options } )

OR: load_file( file_handle => [file handle reference] => { options } )

If [file name] is passed, returns the contents of [file name] in a string. If a [file handle reference] is passed instead, the filehandle will be read with CORE::read() and the data obtained by the read will be returned in a string.

If you desire the contents of the file (or file handle data) in a list of lines instead of a single string, this can be accomplished through the use of the as_lines option (see below).

Options accepted by load_file()

as_lines => boolean

If this option is enabled then your call to load_file will return a list of strings, each one of which is a line as it was read from the file [file name]. The lines are returned in the order they are read, from the beginning of the file to the end.

This is not the default behavior. The default behavior is for load_file to return a single string containing the entire contents of the file.

no_lock => boolean

By default this method will attempt to get a lock on the file while it is being read, following whatever rules are in place for the flock policy established either by default (implicitly) or changed by you in a call to File::Util::flock_rules() (see the flock_rules() entry below).

This method will not try to get a lock on the file if the File::Util object was created with the option no_lock or if the method was called with the option no_lock.

This method will automatically call binmode() on binary files for you. If you pass in a filehandle instead of a file name you do not get this automatic check performed for you. In such a case, you'll have to call binmode() on the filehandle yourself. Once you pass a filehandle to this method it has no way of telling if the file opened to that filehandle is binary or not.

binmode => [ boolean or 'utf8' ]

Tells File::Util to read the file in binmode (if set to a true boolean value such as 1) or to read the file as UTF-8 encoded data (if set to the value utf8). (See "binmode" in perlfunc.)

You need Perl 5.8 or better to use 'utf8' or your program will fail with an error message.

read_limit => positive integer

Overrides the global read limit setting for the File::Util object you are working with, on a one-time basis. By specifying this option with a positive integer value (representing the maximum number of bytes to allow for your load_file() call), you are telling load_file() to ignore the global/default setting for just that call, and to apply your one-time limit of [ positive integer ] bytes on the file while it is read into memory.

Notes: This method does not distinguish between plain files and other file types such as binaries, FIFOs, sockets, etc.

Restrictions imposed by the current "read limit" (see the read_limit() entry below) will be applied to the files opened by this method. Adjust the read limit as necessary either by overriding (using the 'read_limit' option above), or by adjusting the global value for your File::Util object with the provided read_limit() object method.
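Here is a short sketch of the three most common ways to call load_file(), assuming the call forms shown in the syntax lines above. The file name is made up, and the file is created by the snippet itself so the example is self-contained.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

my $f    = File::Util->new();
my $file = tempdir( CLEANUP => 1 ) . '/notes.txt';   # hypothetical file name

# create a three-line file to read back
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "one\ntwo\nthree\n";
close $out or die "Cannot close $file: $!";

# default: the whole file in a single string
my $content = $f->load_file( $file );

# as_lines: one list element per line
my @lines = $f->load_file( $file => { as_lines => 1 } );

# read_limit: cap this one call at 1024 bytes (the global limit is untouched)
my $capped = $f->load_file( $file => { read_limit => 1024 } );

print scalar(@lines), " lines, ", length($content), " characters\n";
```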

make_dir

Syntax: make_dir( [new directory name], [bitmask] => { options } )

Attempts to create (recursively) a directory as [new directory name] with the [bitmask] provided. The bitmask is an optional argument and defaults to oct 777, combined with the current user's umask. If specified, the bitmask must be supplied in the form required by the native perl umask function (as an octal number). See "umask" in perlfunc for more information about the format of the bitmask argument.

As mentioned above, the recursive creation of directories is transparently handled for you. This means that if the name of the directory you pass in contains a parent directory that does not exist, the parent directory(ies) will be created for you automatically and silently in order to create the final directory in the [new directory name].

Simply put, if [new directory name] is "/path/to/directory" and the directory "/path/to" does not exist, the directory "/path/to" will be created and the "/path/to/directory" directory will be created thereafter. All directories created will be created with the [bitmask] you specify, or with the default of oct 777, combined with the current user's umask.

Upon successful creation of the [new directory name], the [new directory name] is returned to the caller.
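If the phrase "combined with the current user's umask" is unfamiliar, this little sketch shows the arithmetic in plain Perl. It does not call File::Util at all, and effective_mode() is a hypothetical helper written only for this illustration: the operating system clears every permission bit that is set in the umask.

```perl
use strict;
use warnings;

# Permissions a new directory ends up with: the requested bitmask with
# every bit that is set in the umask cleared.
sub effective_mode {
    my ( $bitmask, $umask ) = @_;
    return $bitmask & ~$umask;
}

# With the common umask of 022, the default bitmask of oct 777 yields 755
printf "%o\n", effective_mode( oct(777), oct(22) );   # prints 755
```

So on a system where your umask is 022, a directory created with the default bitmask ends up with permissions of 755, not 777.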

Options accepted by make_dir()

if_not_exists => boolean

Example:

$f->make_dir( '/home/jspice' => oct 755 => { if_not_exists => 1 } );

If this option is enabled then make_dir will not attempt to create the directory if it already exists. Rather it will return the name of the directory as it normally would if the directory did not exist previous to calling this method.

If a call to this method is made without the if_not_exists option and the directory specified as [new directory name] does in fact exist, an error will result as it is impossible to create a directory that already exists.

abort_depth

Syntax: abort_depth( [positive integer] )

When called without any arguments, this method returns an integer reflecting the current number of times the File::Util object will dive into the subdirectories it discovers when recursively listing directory contents from a call to File::Util::list_dir(). The default is 1000. If the number is exceeded, the File::Util object will fail with an error.

When called with an argument, it sets the maximum number of times a File::Util object will recurse into subdirectories before failing with an error message.

This method can only be called with a numeric integer value. Passing a bad argument to this method will cause it to fail with an error.

needs_binmode

Syntax: needs_binmode

Returns 1 if the machine on which the code is running requires that binmode() (a built-in function) be called on open file handles, or returns 0 if not. (See "binmode" in perlfunc.) This is a constant method. It accepts no arguments and will always return the same value for the system on which it is executed.

new

Syntax: new( { options } )

This is the File::Util constructor method. It returns a new File::Util object reference when you call it. It recognizes various options that govern the behavior of the new File::Util object.

Parameters accepted by new()

use_flock => boolean

Optionally specify this option to the File::Util::new method to tell the new object whether it should attempt to use flock() in its I/O operations. The default is to use flock() if available on your system. Specify this option with a true or false value (1 or 0): true to use flock(), false to never use it.

read_limit => positive integer

Optionally specify this option to the File::Util::new method to instruct the new object that it should never attempt to open and read in a file greater than the number of bytes you specify. This argument can only be a numeric integer value, otherwise it will be silently ignored. The default read limit for File::Util objects is 52428800 bytes (50 megabytes).

abort_depth => positive integer

Optionally specify this option to the File::Util::new method to instruct the new object to set the maximum number of times it will recurse into subdirectories while performing directory listing operations before failing with an error message. This argument can only be a numeric integer value, otherwise it will be silently ignored.

onfail => [keyword or code reference]

Set the default policy for how the new File::Util object handles fatal errors. This option takes any one of a list of predefined keywords, or a reference to a named or anonymous error handling subroutine of your own.

You can supply an onfail handler to nearly any function in File::Util, but when you do so for the new() constructor, you are setting the default.

Acceptable values are all covered in the ERROR HANDLING section (above), along with proper syntax and example usage.
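As a quick sketch of the constructor in use, here are the first three options passed together in the options hashref. The particular values are arbitrary choices for illustration, and onfail is left out because its keywords are covered in the ERROR HANDLING section.

```perl
use strict;
use warnings;
use File::Util;

my $f = File::Util->new(
    {
        use_flock   => 0,                  # skip flock(), e.g. on NFS
        read_limit  => 10 * 1024 * 1024,   # 10 megabytes (10485760 bytes)
        abort_depth => 50,                 # give up after 50 levels of recursion
    }
);

# the matching object methods report the settings now in force
print $f->read_limit(),  "\n";
print $f->abort_depth(), "\n";
```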

onfail

Syntax: onfail( [keyword or code reference] )

Dynamically set/change the default error handling policy for an object.

This works exactly the same as it does when you specify an "onfail" handler to the constructor method (see also new).

The syntax and keywords available to use for this method are already discussed above in the ERROR HANDLING section, so refer to that for in-depth details.

open_handle

Attempts to get a lexically scoped open file handle on [file name] in [mode] mode. Returns the file handle if successful or generates a fatal error with a diagnostic message if the operation fails.

You will need to remember to call close() on the filehandle yourself, at your own discretion. Leaving filehandles open is not a good practice, and is not recommended. (See "close" in perlfunc.)

Once you have the file handle you would use it as you would use any file handle. Remember that unless you specifically turn file locking off when the File::Util object is created (see new) or by using the no_lock option when calling open_handle, that file locking is going to automagically be handled for you behind the scenes, so long as your OS supports file locking of any kind at all. Great! It's very convenient for you to not have to worry about portability in taking care of file locking between one application and the next; by using File::Util in all of them, you know that you're covered.

The price of that larger set of features (compare write_file to this method) is a slight inconvenience: you will have to release the file lock on the open handle yourself. File::Util can't manage it for you anymore once it turns the handle over to you. At that point, it's all yours. In order to release the file lock on your file handle, call unlock_open_handle() on it. Otherwise the lock will remain for the life of your process. If you don't want to use the free portable file locking, remember the no_lock option, which will turn off file locking for your open handle. Seldom, however, should you ever opt to not use file locking unless you really know what you are doing. The only obvious exception would be if you are working with files on a network-mounted filesystem like NFS or SMB (CIFS), in which case locking can be buggy.

If the file does not yet exist it will be created, and it will be created with a bitmask of [bitmask] if you specify a file creation bitmask using the 'bitmask' option, otherwise the file will be created with the default bitmask of oct 777. The bitmask is combined with the current user's umask, whether you specify a value or not. This is a function of Perl, not File::Util.

If specified, the bitmask must be supplied in the form of an octal number as required by the native perl umask function. See "umask" in perlfunc for more information about the format of the bitmask argument. If the file [file name] already exists then the bitmask argument has no effect and is silently ignored.

Any non-existent directories in the path preceding the actual file name will be automatically (and silently - no warnings) created for you and any new directories will be created with a bitmask of [dbitmask], provided you specify a directory creation bitmask with the 'dbitmask' option.

If specified, the directory creation bitmask [dbitmask] must be supplied in the form required by the native perl umask function.

If there is an error while trying to create any preceding directories, the failure results in a fatal error. If all directories preceding the name of the file already exist, the dbitmask argument has no effect and is silently ignored.

Native Perl open modes

The default behavior of open_handle() is to open file handles using Perl's native open() (see "open" in perlfunc). Unless you use the use_sysopen option, only the following modes are valid:

mode => 'read' (this is the default mode)

[file name] is opened in read-only mode. If the file does not yet exist then a fatal error will occur.

mode => 'write'

[file name] is created if it does not yet exist. If [file name] already exists then its contents are overwritten with the new content provided.

mode => 'append'

[file name] is created if it does not yet exist. If [file name] already exists its contents will be preserved and the new content you provide will be appended to the end of the file.

System level open modes ("open a la C")

Optionally you can ask File::Util to open your handle using CORE::sysopen instead of using the native Perl CORE::open(). This is accomplished by enabling the use_sysopen option. Using this feature opens up more possibilities as far as the open modes you can choose from, but also carries with it a few caveats so you have to be careful, just as you'd have to be a little more careful when using sysopen() anyway.

Specifically you need to remember that when using this feature you must NOT mix different types of I/O when working with the file handle. You can't go opening file handles with sysopen() and print to them as you normally would print to a file handle. You have to use syswrite() instead. The same applies here. If you get a sysopen()'d filehandle from open_handle() it is imperative that you use syswrite() on it. You'll also need to use sysseek() and the other sys* functions on the filehandle instead of their native Perl equivalents.

That said, here are the different modes you can choose from to get a file handle when using the use_sysopen option. Remember that these won't work unless you use that option, and will generate an error if you try using them without it. The standard 'read', 'write', and 'append' modes are already available to you by default. These are the extended modes:

mode => 'rwcreate'

[file name] is opened in read-write mode, and will be created for you if it does not already exist.

mode => 'rwupdate'

[file name] is opened for you in read-write mode, but must already exist. If it does not exist, a fatal error will result.

mode => 'rwclobber'

[file name] is opened for you in read-write mode. If the file already exists its contents will be "clobbered" or wiped out. The file will then be empty and you will be working with the then-truncated file. This cannot be undone. Once you call open_handle() using this option, your file WILL be wiped out. If the file does not exist yet, it will be created for you.

mode => 'rwappend'

[file name] will be opened for you in read-write mode ready for appending. The file's contents will not be wiped out; they will be preserved and you will be working in append fashion. If the file does not exist, it will be created for you.

Remember to use sysread() and not plain read() when reading those sysopen()'d filehandles!
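To make the "don't mix I/O types" rule concrete, here is a sketch in plain core Perl, without File::Util, of working with a sysopen()'d handle from start to finish. The file name is made up, and the flag combination is only an illustration of read-write access with truncation, not a statement of how File::Util implements any particular mode.

```perl
use strict;
use warnings;
use Fcntl qw( O_RDWR O_CREAT O_TRUNC SEEK_SET );
use File::Temp qw( tempdir );

my $path = tempdir( CLEANUP => 1 ) . '/raw.dat';   # hypothetical file name

# open read-write, creating the file if needed and emptying it if it exists
sysopen( my $fh, $path, O_RDWR | O_CREAT | O_TRUNC )
    or die "Cannot sysopen $path: $!";

# write with syswrite(), never print()
defined syswrite( $fh, "hello" ) or die "syswrite failed: $!";

# rewind with sysseek(), never seek()
defined sysseek( $fh, 0, SEEK_SET ) or die "sysseek failed: $!";

# read with sysread(), never read() or <$fh>
my $bytes = sysread( $fh, my $buffer, 5 );
defined $bytes or die "sysread failed: $!";

close $fh or die "Cannot close $path: $!";
print "$buffer\n";
```

Every operation between sysopen() and close() uses a sys* function, which is exactly the discipline you need to keep with a handle obtained through the use_sysopen option.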

Options accepted by open_handle()

binmode => [ boolean or 'utf8' ]

Tells File::Util to open the file in binmode (if set to a true boolean value such as 1) or to open the file with UTF-8 encoding (if set to the value utf8). (See "binmode" in perlfunc.)

You need Perl 5.8 or better to use "utf8" or your program will fail with an error message.

Example Usage:

$ftl->open_handle( 'encoded.txt' => { binmode => 'utf8' } );

no_lock => boolean

By default this method will attempt to get a lock on the file while it is being read, following whatever rules are in place for the flock policy established either by default (implicitly) or changed by you in a call to File::Util::flock_rules() (see flock_rules()).

This method will not try to get a lock on the file if the File::Util object was created with the option no_lock or if this method is called with the option no_lock.

use_sysopen => boolean

Instead of opening the file using Perl's native open() command, File::Util will open the file with the sysopen() command. You will have to remember that your filehandle is a sysopen()'d one, and that you will not be able to use native Perl I/O functions on it. You will have to use the sys* equivalents. See perlopentut for a more in-depth explanation of why you can't mix native Perl I/O with system I/O.

read_limit

Syntax: read_limit( [positive integer] )

By default, the largest file that File::Util will read into memory and return via load_file() is 52428800 bytes (50 megabytes).

This value can be modified by calling this method with an integer value reflecting the new limit you want to impose, in bytes. For example, if you want to set the limit to 10 megabytes, call the method with an argument of 10485760.

If this method is called without an argument, the read limit currently in force for the File::Util object will be returned.
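A brief sketch of reading and then changing the limit on a freshly created object, using the 10 megabyte figure from the text above:

```perl
use strict;
use warnings;
use File::Util;

my $f = File::Util->new();

my $default = $f->read_limit();        # limit currently in force
$f->read_limit( 10 * 1024 * 1024 );    # lower it to 10 megabytes
my $current = $f->read_limit();

print "was $default, now $current\n";
```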

return_path

Syntax: return_path( [string] )

Takes the file path from the file name provided and returns it such that /who/you/callin/scruffy.txt is returned as /who/you/callin.

This method is optimized for speed and returns anything that could possibly be a file path, even if that means the path is actually foo.bar if you passed it such an argument. Technically, you could indeed have a directory named blaster.txt, so this method doesn't distinguish between strings that look like file names and ones that don't.

If you want one that does, you need to use strict_path() instead. (see strict_path)

size

Syntax: size( [file name] )

Returns the file size of [file name] in bytes. Returns 0 if the file is empty. Returns undef if the file does not exist.

split_path

Syntax: split_path( [string] )

Takes a path/filename, fully-qualified or relative (it doesn't matter), and returns a list comprising the root of the path (if any), each directory in the path, and the final part of the path (be it a file, a directory, or otherwise).

This method doesn't divine or detect any information about the path; it simply manipulates the string value. It doesn't map it to any real filesystem object, so it doesn't matter whether the file/path named in the input string exists.

strict_path

Syntax: strict_path( [string] )

Works just like return_path() except that it is more strict in what it returns. If you pass it a string that does not "look" like a path (a string with no directory separators or that is not . or ..), then this method will return undef.

If you'd like to get a default path string returned instead of undef, then you want to use the default_path() method instead.

strip_path

Strips the file path from the file name provided and returns the file name only.

Given /kessel/run/12/parsecs, it returns parsecs

Given C:\you\scoundrel, it returns scoundrel
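The four path helpers described above can be compared side by side. This sketch reuses the example path from the return_path entry; since these methods only manipulate strings, nothing needs to exist on disk.

```perl
use strict;
use warnings;
use File::Util;

my $f    = File::Util->new();
my $path = '/who/you/callin/scruffy.txt';

my $dir   = $f->return_path( $path );   # directory portion
my $name  = $f->strip_path( $path );    # file name portion
my @parts = $f->split_path( $path );    # root, directories, final element

# strict_path() refuses a string that has no directory separator in it
my $none = $f->strict_path( 'scruffy.txt' );

print "dir:  $dir\n";
print "name: $name\n";
print "last: $parts[-1]\n";
print "strict_path gave ", ( defined $none ? $none : 'undef' ), "\n";
```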

touch

Syntax: touch( [file name] )

Behaves like the *nix touch command: it updates the access and modification times of the specified file to the current time. If the file does not exist, File::Util tries to create it empty. This method will fail with a fatal error if system permissions deny alterations to or creation of the file.

Returns 1 if successful. If unsuccessful, fails with an error.
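Here is a small sketch that pairs touch() with the size() method described earlier, so you can watch a file come into existence. The file name is made up and lives in a temporary directory.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

my $f    = File::Util->new();
my $file = tempdir( CLEANUP => 1 ) . '/stamp.txt';   # hypothetical file name

my $before = $f->size( $file );   # undef: the file does not exist yet
$f->touch( $file );               # creates the file empty
my $after = $f->size( $file );    # zero: the file exists and is empty

print "exists now\n" if -e $file;
```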

trunc

Syntax: trunc( [file name] )

Truncates [file name] (i.e., wipes out, or "clobbers", the contents of the specified file). Returns 1 if successful. If unsuccessful, fails with a descriptive error message about what went wrong.

unlock_open_handle

Returns true on success, false on failure. Will not raise a fatal error if the unlock operation fails. You can capture the return value from your call to this method and die() if you so desire. Failure is very unlikely, or File::Util wouldn't have been able to get a portable lock on the file in the first place.

If File::Util wasn't able to ever lock the file due to limitations of your operating system, a call to this method will return a true value.

If file locking has been disabled on the file handle via the no_lock option at the time open_handle was called, or if file locking was disabled using the use_flock method, or if file locking was disabled on the entire File::Util object at the time of its creation (see new()), calling this method will have no effect and a true value will be returned.
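Putting open_handle() and unlock_open_handle() together, the whole life cycle of a handle might look like the sketch below. The file name is made up, and the named-argument call style (file => ..., mode => ...) is an assumption based on the mode => 'write' notation used in this manual; check the open_handle syntax for your installed version.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

my $f    = File::Util->new();
my $file = tempdir( CLEANUP => 1 ) . '/journal.log';   # hypothetical file name

# get a locked, lexically scoped handle opened for writing
# (assumed call style: named file and mode arguments)
my $fh = $f->open_handle( file => $file, mode => 'write' );

print {$fh} "first entry\n";

# release the lock yourself, then close the handle yourself
my $unlocked = $f->unlock_open_handle( $fh );
warn "could not release the lock on $file\n" unless $unlocked;

close $fh or die "Cannot close $file: $!";
```

Checking the return value costs nothing, since unlock_open_handle() reports failure by returning false instead of raising an error.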

use_flock

Syntax: use_flock( [true / false value] )

When called without any arguments, this method returns a true or false value to reflect the current use of flock() within the File::Util object.

When called with a true or false value as its single argument, this method will tell the File::Util object whether or not it should attempt to use flock() in its I/O operations. A true value indicates that the File::Util object will use flock() if available, a false value indicates that it will not. The default is to use flock() when available on your system.

DON'T USE FLOCK ON NETWORK FILESYSTEMS

If you are working with files on an NFS mount, or a Windows file share, it is quite likely that using flock will be buggy and cause unexpected failures in your program. You should not use flock in such situations.

A WORD OF CAUTION FOR SOLARIS USERS

File locking has known issues on Solaris. Solaris claims to offer a native flock() implementation, but after obtaining a lock on a file, Solaris will very often just silently refuse to unlock it again until your process has completely exited. This is not an issue with File::Util or even with Perl itself. Other programming languages encounter the same problems; it is a system-level issue. So please be aware of this if you are a Solaris user and want to use file locking on your OS.

write_file

Attempts to write [string] to [file name] in mode [mode]. If the file does not yet exist it will be created, and it will be created with a bitmask of [bitmask] if you specify a file creation bitmask using the 'bitmask' option, otherwise the file will be created with the default bitmask of oct 777. The bitmask is combined with the current user's umask, whether you specify a value or not. This is a function of Perl, not File::Util.

[string] should be a string or a scalar variable containing a string. The string can be any type of data, such as a binary stream, or ascii text with line breaks, etc. Be sure to enable the binmode => 1 option for binary streams, and be sure to specify a value of binmode => 'utf8' for UTF-8 encoded data.

NOTE: you will need Perl version 5.8 or better to use the 'utf8' feature, or your program will fail with an error.

If specified, the bitmask must be supplied in the form of an octal number, as required by the native perl umask function. See "umask" in perlfunc for more information about the format of the bitmask argument. If the file [file name] already exists then the bitmask argument has no effect and is silently ignored.

Returns 1 if successful or fails with an error if not successful.

Any non-existent directories in the path preceding the actual file name will be automatically (and silently - no warnings) created for you and new directories will be created with a bitmask of [dbitmask], provided you specify a directory creation bitmask with the 'dbitmask' option.

If specified, the directory creation bitmask [dbitmask] must be supplied in the form required by the native perl umask function.

If there is a problem while trying to create any preceding directories, the failure results in a fatal error. If all directories preceding the name of the file already exist, the dbitmask argument has no effect and is silently ignored.

mode => 'write' (this is the default mode)

[file name] is created if it does not yet exist. If [file name] already exists then its contents are overwritten with the new content provided.

mode => 'append'

[file name] is created if it does not yet exist. If [file name] already exists its contents will be preserved and the new content you provide will be appended to the end of the file.

Options accepted by write_file()

binmode => [ boolean or 'utf8' ]

Tells File::Util to write the file in binmode (if set to a true boolean value such as 1) or to write the file with UTF-8 encoding (if set to the value utf8). (See "binmode" in perlfunc.)

You need Perl 5.8 or better to use "utf8" or your program will fail with an error message.

Allows you to call this method without providing a content argument (it lets you create an empty file without warning you or failing). Be advised that if you enable this option, it will have the same effect as truncating a file that already has content in it (i.e., it will "clobber" non-empty files).

no_lock => boolean

By default this method will attempt to get a lock on the file while it is being written, following whatever rules are in place for the flock policy established either by default (implicitly) or changed by you in a call to File::Util::flock_rules() (see flock_rules()).

This method will not try to get a lock on the file if the File::Util object was created with the option no_lock or if this method is called with the option no_lock enabled.
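The two write modes can be seen working together in the sketch below. The path is made up, and the named-argument call style (file, content, and mode) is an assumption; consult the write_file syntax for your installed version if it differs.

```perl
use strict;
use warnings;
use File::Util;
use File::Temp qw( tempdir );

my $f    = File::Util->new();
my $file = tempdir( CLEANUP => 1 ) . '/logs/app/events.log';   # hypothetical path

# 'write' is the default mode; the missing logs/app directories are created
$f->write_file( file => $file, content => "started\n" );

# 'append' keeps what is already there and adds to the end
$f->write_file( file => $file, content => "stopped\n", mode => 'append' );

my $result = $f->load_file( $file );
print $result;
```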

valid_filename

Syntax: valid_filename( [string] )

For the given string, returns 1 if the string is a legal file name for the system on which the program is running, or returns undef if it is not. This method does not test for the validity of file paths! It tests for the validity of file names only. (It is used internally to check beforehand if a file name is usable when creating new files, but is also a public method available for external use.)

CONSTANTS

NL

Syntax: NL

Short for "New Line". Returns the correct new line character (or character sequence) for the system on which your program runs.

SL

Syntax: SL

Short for "Slash". Returns the correct directory path separator for the system on which your program runs.

OS

Syntax: OS

Returns the File::Util keyword for the operating system FAMILY it detected. The keyword is derived from the contents of $^O; if $^O cannot be found, it is derived from the contents of $Config::Config{osname} (see the native Config library); and if that doesn't contain a recognizable value either, it falls back to UNIX. The keyword will be one of those listed below.

Generally speaking, Linux operating systems are going to be detected as UNIX. This isn't a bug. The OS FAMILY to which it belongs uses UNIX style filesystem conventions and line endings, which are the relevant things to file handling operations.

UNIX

Specifics: OS name =~ /^(?:darwin|bsdos)/i

CYGWIN

Specifics: OS name =~ /^cygwin/i

WINDOWS

Specifics: OS name =~ /^MSWin/i

VMS

Specifics: OS name =~ /^vms/i

DOS

Specifics: OS name =~ /^dos/i

MACINTOSH

Specifics: OS name =~ /^MacOS/i

EPOC

Specifics: OS name =~ /^epoc/i

OS2

Specifics: OS name =~ /^os2/i
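The lookup described above can be sketched in plain Perl as a table of patterns with a UNIX fallback. This is an illustration built from the patterns listed in this section, not File::Util's actual implementation, and os_family() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Ordered list of [ keyword, pattern ] pairs taken from the list above
my @families = (
    [ UNIX      => qr/^(?:darwin|bsdos)/i ],
    [ CYGWIN    => qr/^cygwin/i ],
    [ WINDOWS   => qr/^MSWin/i ],
    [ VMS       => qr/^vms/i ],
    [ DOS       => qr/^dos/i ],
    [ MACINTOSH => qr/^MacOS/i ],
    [ EPOC      => qr/^epoc/i ],
    [ OS2       => qr/^os2/i ],
);

# Return the family keyword for an OS name, falling back to UNIX
sub os_family {
    my ($os_name) = @_;
    for my $pair (@families) {
        my ( $keyword, $pattern ) = @{$pair};
        return $keyword if $os_name =~ $pattern;
    }
    return 'UNIX';
}

print os_family($^O), "\n";
```

Note how a name such as linux matches none of the patterns and so lands on the UNIX fallback, which is the behavior the paragraph about Linux describes.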

AUTHORS

COPYRIGHT

Copyright(C) 2001-2013, Tommy Butler. All rights reserved.

LICENSE

This library is free software, you may redistribute it and/or modify it under the same terms as Perl itself. For more details, see the full text of the LICENSE file that is included in this distribution.

LIMITATION OF WARRANTY

This software is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.