\&output is a reference to a callback function which outputs the records returned by the servers. Basically, the callback function gets the records in the form of an array, in which each element of the array is a line of the record. At the simplest level, you just loop through the array, printing each line and a newline.
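
As a minimal sketch of such a callback: it assumes the parameter order described later in this document (server index first, then a reference to the array of lines), and the sample lines are made up for illustration rather than taken from a real server.

```perl
use strict;
use warnings;

# Minimal output callback: print every line of the record array,
# each followed by a newline. Returns the number of lines printed.
sub output {
    my ($index, $array_ref) = @_;
    my $count = 0;
    foreach my $line (@$array_ref) {
        print "$line\n";
        $count++;
    }
    return $count;
}

# Stand-in data for illustration only; real lines come from the servers.
my @sample = ('245 Title of the first record', '100 Name of its author');
output(0, \@sample);
```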

Here we set options which apply to individual servers in the @options array. asyncZOptions returns a reference to a Net::Z3950::AsyncZ::Options::_params object; we can pass into it options we want to set for individual servers. We have not defined a _params object for library.anu.edu.au, so a default _params will be created for it.

As you can see, we can set different queries for different servers; we can set separate logs, assuming we want to track errors separately, and we can even suppress error reporting on an individual basis. In the case of 'amicus', we have asked that the preferredRecordSyntax be set to Net::Z3950::RecordSyntax::GRS1, since the National Library of Canada uses GRS-1 as its default output; we could also have done that in the call to asyncZOptions:

asyncZOptions(preferredRecordSyntax=>Net::Z3950::RecordSyntax::GRS1);

In addition to detailed logging of error messages, there's also error reporting aimed at the user, to inform users when records haven't been returned. See "Errors" below.

Net::Z3950::AsyncZ adds a layer of asynchronous support to the Z3950 module through the use of multiple forked processes. Users may also find that it provides a convenient front end to Z3950.

My own experience with Z3950 async mode was that I could connect to servers and get back the number of records waiting to be fetched, but I was unable to retrieve the records themselves.

The Z3950 documentation talks about this situation:

when the connection is asynchronous, the errcode() may
be zero, indicating simply that the record has not yet been fetched from
the server. In this case, the calling code should try again later. (How
much later? As a rule of thumb, after it's done ``something else'', such
as request another record or issue another search.)

The documentation promises to provide user code for asynchronous access at a later date, and since synchronous access is apparently written on top of asynchronous code, the techniques for the async mode no doubt exist. But I searched the mailing list archive and couldn't find anything relevant. So, at the risk of carrying coals to Newcastle, I wrote AsyncZ.

AsyncZ forks off maxpipes processes at a time. After these processes have returned and reported their results, or after a timeout period, the next set of maxpipes are forked off, and so forth. An Event loop is set in motion that enables AsyncZ to wait for results--either records or error messages--to return from the Z39.50 servers. Records are passed through, in the order in which they arrive, to a callback function (cb), which you supply and which outputs the records.
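
The batching described above can be pictured with a short sketch. It only groups a server list into sets of at most maxpipes entries; it does not fork anything, and it is not AsyncZ's own code. The host names are hypothetical.

```perl
use strict;
use warnings;

# Split a list of servers into batches of at most $maxpipes entries.
# Illustration only: AsyncZ forks one child per server in a batch,
# which this sketch does not attempt to do.
sub batches {
    my ($maxpipes, @servers) = @_;
    die "maxpipes must be at least 1\n" if $maxpipes < 1;
    my @batches;
    while (@servers) {
        push @batches, [ splice(@servers, 0, $maxpipes) ];
    }
    return @batches;
}

my @hosts  = map { "server$_.example.org" } 1 .. 10;   # hypothetical hosts
my @groups = batches(4, @hosts);
printf "batch %d: %s\n", $_ + 1, join(', ', @{ $groups[$_] }) for 0 .. $#groups;
```

With ten servers and maxpipes set to 4, the sketch yields batches of four, four, and two servers.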

Each of the forked processes, in turn, runs in its own Event loop while waiting for results to return from the server. The two-fold purpose of these loops, local to each forked process, is:

[1] to help ensure that a request to a server doesn't get swallowed up on the network and never return, causing a script or program to hang;

[2] to set a timeout on how long you are prepared to wait for a response.

The loop in the child process is not always enough in itself to prevent a script from hanging; for such cases you can set a monitor which will kill the main process after a timeout period. See the discussion of monitor in Options.pod.

Various conditions may be responsible for the failure to receive records from a server. In some circumstances, such as timing out, it may be worth a second try. In such cases AsyncZ will try the server a second time. (I refer to these two tries as two cycles.)
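
The two-cycle idea can be sketched in plain Perl. This is a simplified model, not AsyncZ's implementation: it retries every failure once, whereas AsyncZ retries only in circumstances such as a timeout. The $try code reference and the server names are hypothetical stand-ins for real queries.

```perl
use strict;
use warnings;

# Run $try for every server; servers that fail in cycle 1 get one more
# attempt in cycle 2. $try is a hypothetical stand-in for a real query
# and returns true on success. Returns the servers that failed twice.
sub two_cycles {
    my ($try, @servers) = @_;
    my @pending = @servers;
    for my $cycle (1, 2) {
        @pending = grep { !$try->($_, $cycle) } @pending;
        last unless @pending;
    }
    return @pending;
}

# Simulated outcomes: "slow" times out once, "down" never answers.
my %attempts;
my $try = sub {
    my ($server) = @_;
    $attempts{$server}++;
    return 1 if $server eq 'ok';
    return $attempts{$server} > 1 if $server eq 'slow';
    return 0;
};

my @failed = two_cycles($try, qw(ok slow down));
print "failed after two cycles: @failed\n";
```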

The constructor does not return a reference to Net::Z3950::AsyncZ until this two-cycle process is completed. This reference gives you access to any errors which may have been reported, i.e. you can check to see why a server has not returned any records and provide error messages to the user as you see fit. In addition, you can keep an error log with considerably more detailed error reporting; you can in fact keep a separate log for any one or combination of the servers you contact.

Everything essentially proceeds from the constructor. Once you provide the constructor with a list of servers and a query (or queries), and a callback function to output your records, you have nothing to do except wait for the reference which gives you access to the error messages. You can exercise a great deal of control by setting options for both the parent process and any or all of its children.

You will notice that I have retained the @servers array used in Mike Taylor's sample scripts for the Net::Z3950 module, i.e. an array of references to 3-element arrays of servers, ports, and databases.
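
A sketch of what such a @servers array looks like. The host names are ones mentioned in this document; the database names are placeholders, and 210 is the port conventionally used for Z39.50, so check the details of any server you actually query.

```perl
use strict;
use warnings;

# Each entry is [ $host, $port, $database ].
# The database names below are placeholders, not real database names.
my @servers = (
    [ 'library.anu.edu.au', 210, 'DATABASE_1' ],
    [ 'jasper.acadiau.ca',  210, 'DATABASE_2' ],
);

foreach my $server (@servers) {
    my ($host, $port, $database) = @$server;
    print "$host:$port/$database\n";
}
```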

When you run this script at the terminal, you will find several types of headers and detailed error messages interspersed with the query results. For a "clean" output see basic_pretty.pl, which is included in the distribution.

my $asyncZ = Net::Z3950::AsyncZ->new(
servers=>\@servers, # array of references to servers in form: [ $host, $port, $database]
query=>$query, # format depends on Z3950 querytype: defaults to 'prefix'
timeout=>25, # total timeout in seconds for all processes
timeout_min=>5, # minimum timeout in secs to exit event loop if all processes are finished
interval=>1, # Event loop timer interval
maxpipes => 4, # maximum number of forks to be executed at one time
log=>undef, # undef, name of log file to which extended error messages are written
# or Net::Z3950::AsyncZ::Errors::suppressErrors()
cb=>\&cb, # callback function to which records will be sent as available
format=>\&format, # callback function to format individual lines of records
num_to_fetch=>$num, # number of records to fetch from each server
options=>\@options, # array of references to Net::Z3950::AsyncZ::Options::_params objects
monitor => 0 # timeout in seconds for a monitoring child process: if
# 0 no monitor is created
);

AsyncZ::new() takes a set of named parameters. Some of them, like maxpipes and timeout, apply to the overall functioning of Net::Z3950::AsyncZ, i.e. to the parent process. Others, like num_to_fetch and format, can be set individually for each server in the servers array, i.e. for each child process. Settings for the child processes are made using the options parameter and the Net::Z3950::AsyncZ::Options::_params array. If a _params object does not exist for a child process, one is automatically created using default values. The indices of the _params array must be synchronized with the indices of the servers array.

For every query sent to a server you must supply three required parameters: servers, query, and cb. That is, you must supply an array reference to the server's $host, $port, and $database, you must supply the query itself, and finally a callback function, which is responsible for outputting the data returned from the Z39.50 server. This is the minimal configuration, the one shown above in "The Basic Script".

The optional parameters have either default values or default behaviors. Some of the optional parameters are exclusive to the functioning of the parent process, for instance timeout and interval. Others are for use only in the child processes, for instance format and num_to_fetch, while log is used in both the parent and its children.

$error_number: the maximum number of possible errors which have occurred for all servers during the current session; because of the two-cycle process, some errors reported in the first cycle are nullified by successful outcomes during the second cycle; the class method isZ_Error() tests whether a cycle 1 error has been nullified by a successful second attempt. See Net::Z3950::AsyncZ::isZ_Error.

an optional list of named parameters which set the options for a child process. When called without parameters, the _params object is created with a set of default values. Unless you plan to override the default values, it's not necessary to call asyncZOptions: AsyncZ.pm will create a default _params object for you.

There is a full range of accessor methods by which each option can be set and queried in the form of $params_ref->set_option_1(value) and $value=$params_ref->get_option_1(). This makes it possible to set options dynamically.

$param_ref: reference to a Net::Z3950::AsyncZ::Options::_params object.

Net::Z3950::AsyncZ::Options::_params objects are used internally by AsyncZ and hence treated as private. Creating a _params object directly by calling its new method is not recommended. See Net::Z3950::AsyncZ::Options::_params.

$bool: true if header $line designates that current record is of <TYPE>, otherwise false

These utilities test for the type of record which is currently being presented to the callback function. Each record is sent to the callback prefaced with headers that provide information about the record, including its type. If you are querying a variety of servers, some might send back MARC records, others GRS-1.

See also Net::Z3950::AsyncZ::isZ_Header, which tests whether a $line is a type-header, as opposed to whether it designates a particular type of record.

Records are sent to the callback function as an array of lines in which records are separated from one another by a set of headers; you can determine the number of the current record by extracting the record number from its type-header using getZ_RecNum. See "Headers" and "getZ_RecNum".

$recnum: The number of the current record in the Record Set, i.e. if there are 20 records matching the query, and you have asked for 5 at a time, the record number is not one of five, but one of 20. You must first test the line to make sure it is a header:

$err_array_ref: an array reference returned by Net::Z3950::AsyncZ::getErrors (the array holds two Net::Z3950::AsyncZ::ErrMsg objects).

Because of the two-cycle process, some errors reported in the first cycle are nullified by successful outcomes during the second cycle; this method tests for whether a cycle 1 error has been nullified by a successful second attempt.

In other words, it returns false if there has been no error and true if there has been. The type of true value it returns is used by Net::Z3950::AsyncZ::isZ_nonRetryable to determine whether this error was non-recoverable.

This is a convenience method in which the idiom isZ_nonRetryable(isZ_Error($err)) tests whether $err is a non-recoverable cycle 1 error. Since such errors often occur at the system level, this enables you to side-step outputting what might be gobbledygook (e.g. "illegal seek") to the user:

print "There has been an error in contacting this server\n"
if isZ_nonRetryable(isZ_Error($err));

Since there are some non-recoverable cycle 1 errors which might be of interest to the user (e.g. "connection refused", which is identified as a network error), you might test whether it is also a system error:

print "There has been an error in contacting this server\n"
if isZ_nonRetryable(isZ_Error($err)) && $err->isSystem();

$line: either a string or a reference to a string, depending on whether a reference or a string was initially passed in parameter $_[0].

These functions are used internally by AsyncZ but they can be a useful supplement to isZ_Header, isZ_Server, and isZ_PID; instead of testing for these headers, they enable you to either delete them or substitute another string for them.

You might, for instance, find it useful to substitute the name of an institution for the name of a server:

For the record: A callback is a function which you supply and which AsyncZ calls upon as required.

AsyncZ uses two callback functions. One handles the general output of records fetched from the servers queried. The second formats individual lines of the record to your specifications. The format callback is not required.

Note the sequence in which the parameters are passed to the callback:

my($index, $array_ref) = @_;

The array which is referenced by $array_ref contains all of the records fetched from the current server. Each element of the array holds either one line of the record or one of the AsyncZ headers. The headers separate the records, while the format of the record and its lines depends upon two factors:

The first three lines of each record are headers, indicating that you have encountered a new record. The headers hold the following information:

Server name
pid of child process
type of record and record number.

At the very least you would probably want to ignore the headers and add a newline to separate one record from another. The set of class methods provided by Net::Z3950::AsyncZ allows you to deal with the headers as you see fit: you can ignore them, you can identify the record type and extract the record number, and you can extract the server name.

If a server fails to return any records, the array will consist of one line of the following form:

{!-- library.anu.edu.au --}

This line does not tell us which server has failed, only that one of the child processes has not returned any records.

While the server's name is given in the headers to each record, knowing the $index will enable you to track the servers you've queried. For instance, you might want to create an array with the names of the institutions at which servers are located, so that you can tell your users that the current record is a response from Acadia University in Wolfville, N.S., rather than from jasper.acadiau.ca. Knowing the index in the callback enables you to do this.
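
A sketch of that idea, assuming an @institutions array kept in the same order as the @servers array. The helper name institution_for and the sample record lines are hypothetical.

```perl
use strict;
use warnings;

# Institution names, kept in the same order as the @servers array.
my @institutions = (
    'Acadia University, Wolfville, N.S.',   # jasper.acadiau.ca
    'Australian National University',       # library.anu.edu.au
);

# Return a label for the server at $index, or a generic one if none is known.
sub institution_for {
    my ($index) = @_;
    return defined $institutions[$index] ? $institutions[$index] : "server #$index";
}

# Output callback: announce the institution, then print the lines.
sub output {
    my ($index, $array_ref) = @_;
    print "Response from ", institution_for($index), "\n";
    print "$_\n" for @$array_ref;
}

output(0, ['first line of a record', 'second line of a record']);
```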

See "Headers" and basic_pretty.pl, included with the distribution, for some ways of testing for and handling headers.

In the HTML each field is placed within a <td>. It would then be up to you, in your output callback, to complete the HTML by adding the <TABLE>. . .</TABLE> tags and any attributes to those tags. You could also, for instance, format the table using CSS.
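
Completing the table in your output callback might look like the sketch below. It assumes every line handed to it is already a complete table row; the helper name wrap_in_table and the class attribute are made up for illustration.

```perl
use strict;
use warnings;

# Wrap already-formatted rows in a table. Assumes every element of
# @$rows is a complete "<tr>...</tr>" row; the class name is made up
# and could be the hook for CSS formatting.
sub wrap_in_table {
    my ($rows) = @_;
    return join("\n", '<table class="records">', @$rows, '</table>') . "\n";
}

print wrap_in_table([
    '<tr><td>245</td><td>Title of the work</td></tr>',
    '<tr><td>100</td><td>Name of the author</td></tr>',
]);
```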

The functions which create this output are in Net::Z3950::AsyncZ::Report:

You can specify your own row formatter using the format parameter of AsyncZ's constructor. It will always be passed the reference to a two-element array, but if there is no MARC tag, then $row->[0] will be set to the null string and $row->[1] will hold whatever data is available.
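
A hypothetical row formatter along these lines. The name my_format is made up, and returning the formatted line as a string is an assumption of this sketch; check the default formatters in Net::Z3950::AsyncZ::Report for the exact contract.

```perl
use strict;
use warnings;

# Hypothetical row formatter. $row is a reference to a two-element array:
# $row->[0] is the MARC tag (or the null string), $row->[1] the data.
# Returning the formatted line as a string is an assumption of this sketch.
sub my_format {
    my ($row) = @_;
    my ($tag, $data) = @$row;
    return $tag eq '' ? "$data\n" : "[$tag] $data\n";
}

print my_format([ '245', 'Title of the work' ]);
print my_format([ '',    'Data without a MARC tag' ]);
```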

Tip: The default row formatter is _defaultRecordRow. To make _defaultRecordRowHTML your default, set the constructor's format parameter to Net::Z3950::AsyncZ::Report::_defaultRecordRowHTML:

This fourth header tells us that one of the servers failed to return records--but not which one failed. library.anu.edu.au is not the server which failed to respond but the last server which did respond. (The reasons for this have to do with asynchronicity and shared memory.)

This may be useful when you are requesting additional records for the same query. If you are getting 5 records at a time, in your second request to the server, the first of the records returned would be number 6.
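
The arithmetic behind this can be written out as a small helper. The function name is hypothetical, and it assumes every request fetches the same number of records.

```perl
use strict;
use warnings;

# Record number of the first record returned by the Nth request,
# when every request fetches $per_request records.
sub first_record_number {
    my ($request, $per_request) = @_;
    return ($request - 1) * $per_request + 1;
}

print first_record_number(2, 5), "\n";   # second request, 5 records at a time
```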

If you wanted to get rid of the MARC tags and the following white space you could put each line through this filter:
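
One possible filter, as a minimal sketch. It assumes each line begins with a three-digit MARC tag followed by white space; the function name strip_marc_tag is made up, and the pattern may need adjusting to the row format you actually receive.

```perl
use strict;
use warnings;

# Remove a leading three-digit MARC tag and the white space after it.
# Assumes each line starts with the tag, e.g. "245 Title of the work";
# lines without such a prefix are returned unchanged.
sub strip_marc_tag {
    my ($line) = @_;
    $line =~ s/^\d{3}\s+//;
    return $line;
}

print strip_marc_tag('245 Title of the work'), "\n";
```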

The detailed messages contain a number of different kinds of information:

1. a trace back 3 levels
2. server name and query string
3. Z3950 error messages where available
4. system error messages

Detailed errors are sent to a file, sent to the terminal, or suppressed. How they are dealt with depends on the log options of Net::Z3950::AsyncZ::new and Net::Z3950::AsyncZ::Options::_params. This means that you can have different error reporting mechanisms for each of your servers as well as for the parent process.

The default behavior is to write all error messages to the terminal. To write them to a log file you set log to a filename:

log=>$filespec

NOTE: Do not open the file yourself. All files are automatically opened and closed by AsyncZ.

To suppress all errors you do the following:

log=>Net::Z3950::AsyncZ::Errors::suppressErrors()

Since suppressErrors() is exported, you can do this:

use Net::Z3950::AsyncZ::Errors qw(suppressErrors);
log=>suppressErrors()

System error messages and Perl library messages are routinely sent to STDERR; AsyncZ sends its error messages to STDOUT. This means that if you don't do something to redirect the AsyncZ messages and you are operating in a web browser, the AsyncZ messages will go to the browser.

AsyncZ keeps a record of which processes have returned records and which have not. It also keeps track of the exit codes of each process. For each process which has not returned records, it creates a Net::Z3950::AsyncZ::ErrMsg object, based on its exit code. There is a separate set of Net::Z3950::AsyncZ::ErrMsg objects for each of the two AsyncZ cycles (see "The Basic Mechanisms of Net::Z3950::AsyncZ"). A query which reported failure in the first cycle may have been successful in its second attempt. Net::Z3950::AsyncZ::isZ_Error returns true if a server has not returned any records, false if it has.

This applies to two cases: [1] EAGAIN: the system error which returns a "try again" message [2] a process which has been created but never gets far enough to return an exit code, presumably because it has timed out.

An Unspecified error is generally one which has been reported by the system but which I have not included among the errors worth reporting back to ordinary users. (You will, however, find them reported in the log file.) Even some of the errors which I do list might not be worth reporting back to the user (usually those that answer true to isZ_nonRetryable).