Description

The Solaris fault management daemon (fmd) is the central point in Solaris
for fault management. It receives observations from various sources and delivers them
to subscribing diagnosis engines; if those diagnosis engines diagnose a problem, the
fault manager publishes additional protocol events to track the problem lifecycle from initial
diagnosis through repair and final problem resolution. The event protocol is specified
in the Sun Fault Management Event Protocol Specification. The interfaces described here
allow an external process to subscribe to protocol events. See the Fault Management
Daemon Programmer's Reference Guide for additional information on fmd.

The fmd module API (not a Committed interface) allows plugin modules to
load within the fmd process, subscribe to events of interest, and participate
in various diagnosis and response activities. Some of those modules are notification
agents: they subscribe to events describing diagnoses and their subsequent lifecycle and
render them to the console and syslog (the syslog-msgs agent) or as SNMP traps
and a browsable MIB (the snmp-trapgen module and the corresponding dlmod for
the SNMP daemon). Previously it was not possible to subscribe to protocol events
outside of the context of an fmd plugin. The libfmevent interface provides
this external subscription mechanism. External subscribers may receive protocol events as fmd
modules do, but they cannot participate in other aspects of the fmd
module API such as diagnosis. External subscribers are therefore suitable as notification agents
and for transporting fault management events.

Fault Management Protocol Events

This protocol is defined in the Sun Fault Management Event Protocol Specification.
Note that while the API described on this manual page is Committed,
the protocol events themselves (their class names and all event payload) are
not Committed along with this API. The protocol specification document describes the commitment
level of individual event classes and their payload content. In broad terms,
the list.* events are Committed in most of their content and semantics
while events of other classes are generally Uncommitted with a few exceptions.

All protocol events include an identifying class string, with the hierarchies defined
in the protocol document and individual events registered in the Events Registry.
The libfmevent mechanism will permit subscription to events with Category 1 class
of “list” and “swevent”, that is, to classes matching patterns “list.*” and “swevent.*”.

All protocol events consist of a number of (name, datatype, value) tuples
(“nvpairs”). Depending on the event class various nvpairs are required and have
well-defined meanings. In Solaris, fmd protocol events are represented as name-value lists
using the libnvpair(3LIB) interfaces.

API Overview

The API is simple to use in the common case (see Examples),
but provides substantial control to cater for more-complex scenarios.

We obtain an opaque subscription handle using fmev_shdl_init(), quoting the ABI version
and optionally nominating alloc(), zalloc() and free() functions (the defaults use the
umem family). More than one handle may be opened if desired. Each handle
opened establishes a communication channel with fmd, the implementation of which is
opaque to the libfmevent mechanism.

On a handle we may establish one or more subscriptions using fmev_shdl_subscribe().
Events of interest are specified using a simple wildcarded pattern which is
matched against the event class of incoming events. For each match that
is made a callback is performed to a function we associate with the
subscription, passing a nominated cookie to that function. Subscriptions may be dropped
using fmev_shdl_unsubscribe() quoting exactly the same class or class pattern as was
used to establish the subscription.

Each call to fmev_shdl_subscribe() creates a single thread dedicated to serving callback
requests arising from this subscription.

An event callback handler has as arguments an opaque event handle, the
event class, the event nvlist, and the cookie it was registered with
in fmev_shdl_subscribe(). The timestamp for when the event was generated (not when
it was received) is available as a struct timespec with fmev_timespec(), or more
directly with fmev_time_sec() and fmev_time_nsec(); an event handle and struct tm can also
be passed to fmev_localtime() to fill the struct tm. A high-resolution timestamp for
an event may be retrieved using fmev_hrtime(); this value has the semantics
described in gethrtime(3C).

The event handle, class string pointer, and nvlist_t pointer passed as arguments
to a callback are valid for the duration of the callback. If
the application wants to continue to process the event beyond the duration
of the callback then it can hold the event with fmev_hold(), and
later release it with fmev_rele(). When the reference count drops to zero
the event is freed.

Error Handling

In <fm/libfmevent.h> an enumeration fmev_err_t of error types is defined. To render
an error message string from an fmev_err_t use fmev_strerror(). An fmev_errno is
defined which returns the error number for the last failed libfmevent API call
made by the current thread. You may not assign to fmev_errno.

If a function returns type fmev_err_t, then success is indicated by FMEV_SUCCESS
(or FMEV_OK as an alias); on failure a FMEVERR_* value is returned
(see <fm/libfmevent.h>).

If a function returns a pointer type then failure is indicated by
a NULL return, and fmev_errno will record the error type.

Subscription Handles

A subscription handle is required in order to establish and manage subscriptions.
This handle represents the abstract communication mechanism between the application and the
fault management daemon running in the current zone.

A subscription handle is represented by the opaque fmev_shdl_t datatype. A handle
is initialized with fmev_shdl_init() and quoted to subsequent API members.

To simplify usage of the API, subscription attributes for all subscriptions established
on a handle are a property of the handle itself; they
cannot be varied per-subscription. Where different subscriptions need different
attributes, multiple handles must be used.

libfmevent ABI version

The first argument to fmev_shdl_init() indicates the libfmevent ABI version with which
the handle is being opened. Specify either LIBFMEVENT_VERSION_LATEST to indicate the most
recent version available at compile time or LIBFMEVENT_VERSION_1 (_2, etc. as the
interface evolves) for an explicit choice.

Interfaces present in an earlier version of the library will continue to
be present with the same or compatible semantics in all subsequent versions.
When additional interfaces and functionality are introduced the ABI version will be
incremented. When an ABI version is chosen in fmev_shdl_init(), only interfaces introduced
in or before that version will be available to the application via
that handle. Attempts to use later API members will fail with FMEVERR_VERSION_MISMATCH.

This manual page describes LIBFMEVENT_VERSION_1.

Privileges

The libfmevent API is not least-privilege aware; you need to have all
privileges to call fmev_shdl_init(). Once a handle has been initialized with fmev_shdl_init()
a process can drop privileges down to the basic set and continue
to use fmev_shdl_subscribe() and other libfmevent interfaces on that handle.

Underlying Event Transport

The implementation of the event transport by which events are published from
the fault manager and multiplexed out to libfmevent consumers is strictly private.
It is subject to change at any time, and you should not
encode any dependency on the underlying mechanism into your application. Use only the
API described on this manual page and in <fm/libfmevent.h>.

The underlying transport mechanism is guaranteed to have the property that a
subscriber may attach to it even before the fault manager is running.
If the fault manager starts first then any events published before the
first consumer subscribes will wait in the transport until a consumer appears.

The underlying transport will also have some maximum depth to the queue
of events pending delivery. This may be hit if there are no
consumers, or if consumers are not processing events quickly enough. In practice
the rate of events is small. When this maximum depth is reached
additional events will be dropped.

The underlying transport has no concept of priority delivery; all events are
treated equally.

Subscription Handle Initialization

Obtain a new subscription handle with fmev_shdl_init(). The first argument is
the libfmevent ABI version to be used (see above). The remaining
three arguments should be all NULL to leave the library to use
its default allocator functions (the libumem family), or all non-NULL to appoint wrappers
to custom allocation functions if required.

On failure, fmev_shdl_init() returns NULL with fmev_errno set to one of the following:

FMEVERR_VERSION_MISMATCH

The library does not support the version requested.

FMEVERR_ALLOC

An error occurred in trying to allocate data structures.

FMEVERR_API

The alloc(), zalloc(), or free() arguments must either be all NULL or all non-NULL.

FMEVERR_NOPRIV

Insufficient privilege to perform operation. In version 1 root privilege is required.

FMEVERR_INTERNAL

Internal library error.
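
Where custom allocators are wanted, they are supplied as a complete set. The sketch below shows one plausible set of wrappers over the C library allocator, together with the all-or-nothing check that FMEVERR_API implies. The callback shapes used here (alloc and zalloc taking a size, and a free that also receives the size, in the style of libumem) are an assumption to be checked against <fm/libfmevent.h>; the names my_alloc(), my_zalloc(), my_free() and allocators_consistent() are hypothetical and not part of libfmevent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed callback shapes; verify against <fm/libfmevent.h>. */
typedef void *(*alloc_fn_t)(size_t);
typedef void (*free_fn_t)(void *, size_t);

/* Hypothetical wrappers that forward to the C library allocator. */
void *
my_alloc(size_t sz)
{
	return (malloc(sz));
}

void *
my_zalloc(size_t sz)
{
	return (calloc(1, sz));
}

void
my_free(void *buf, size_t sz)
{
	(void) sz;	/* the size is not needed by free() */
	free(buf);
}

/*
 * Mirror of the rule stated for fmev_shdl_init(): the three allocator
 * arguments must be all NULL or all non-NULL.
 */
bool
allocators_consistent(alloc_fn_t a, alloc_fn_t z, free_fn_t f)
{
	int n = (a != NULL) + (z != NULL) + (f != NULL);

	return (n == 0 || n == 3);
}
```

With the real library, wrappers of this kind would be passed as the second, third and fourth arguments to fmev_shdl_init(); passing only some of them is the case reported as FMEVERR_API.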

Fault Manager Authority Information

Once a subscription handle has been initialized, authority information for the fault
manager to which the client is connected may be retrieved with fmev_shdl_getauthority().
The caller is responsible for freeing the returned nvlist using nvlist_free(3NVPAIR).

Subscription Handle Finalization

Close a subscription handle with fmev_shdl_fini(). This call must not be performed
from within the context of an event callback handler, else it will
fail with FMEVERR_API.

The fmev_shdl_fini() call will remove all active subscriptions on the handle and
free resources used in managing the handle.

On failure, fmev_shdl_fini() returns:

FMEVERR_API

May not be called from event delivery context for a subscription on the same handle.

Subscribing To Events

To establish a new subscription on a handle, use fmev_shdl_subscribe(). Besides
the handle argument you provide the class or class pattern to subscribe
to (the latter permitting simple wildcarding using '*'), a callback function pointer
for a function to be called for all matching events, and a cookie
to pass to that callback function.

The class pattern must match events per the fault management protocol specification,
such as “list.suspect” or “list.*”. Patterns that do not map onto
existing events will not be rejected; they simply will not result in
any callbacks.
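
To make the wildcard semantics concrete, the sketch below approximates class-pattern matching with fnmatch(3C). This is an assumption for illustration only: this page does not say how libfmevent matches patterns internally, and the helper name class_matches() is hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether an event class such as
 * "list.suspect" matches a subscription pattern such as "list.*".
 * fnmatch() stands in for the library's own private matcher.
 */
bool
class_matches(const char *pattern, const char *class)
{
	if (pattern == NULL || class == NULL)
		return (false);
	return (fnmatch(pattern, class, 0) == 0);
}
```

Under this model a pattern of “list.*” matches “list.suspect”, while a pattern that corresponds to no published class never matches anything and so never produces a callback.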

A callback function has type fmev_cbfunc_t. The first argument is an
opaque event handle for use in event access functions described below.
The second argument is the event class string, and the third argument
is the event nvlist; these could be retrieved using fmev_class() and fmev_attr_list()
on the event handle, but they are supplied as arguments for convenience.
The final argument is the cookie requested when the subscription was established
in fmev_shdl_subscribe().

Each call to fmev_shdl_subscribe() opens a new door into the process that
the kernel uses for event delivery. Each subscription therefore uses one
file descriptor in the process.

See below for more detail on event callback context.

On failure, fmev_shdl_subscribe() returns one of the following:

FMEVERR_API

Class pattern is NULL or callback function is NULL.

FMEVERR_BADCLASS

Class pattern is the empty string, or exceeds the maximum length of FMEV_MAX_CLASS.

FMEVERR_ALLOC

An attempt to fmev_shdl_zalloc() additional memory failed.

FMEVERR_DUPLICATE

Duplicate subscription request. Only one subscription for a given class pattern may exist on a handle.

FMEVERR_MAX_SUBSCRIBERS

A system-imposed limit on the maximum number of subscribers to the underlying transport mechanism has been reached.

FMEVERR_INTERNAL

An unknown error occurred in trying to establish the subscription.

Unsubscribing

An unsubscribe request using fmev_shdl_unsubscribe() must exactly match a previous subscription request
or it will fail with FMEVERR_NOMATCH. The request stops further callbacks
for this subscription, waits for any existing active callbacks to complete, and drops
the subscription.

Do not call fmev_shdl_unsubscribe() from event callback context, else it will fail
with FMEVERR_API.

On failure, fmev_shdl_unsubscribe() returns one of the following:

FMEVERR_API

A NULL pattern was specified, or the call was attempted from callback context.

FMEVERR_NOMATCH

The pattern provided does not match any open subscription. The pattern must be an exact match.

FMEVERR_BADCLASS

The class pattern is the empty string or exceeds FMEV_MAX_CLASS.

Event Callback Context

Event callback context is defined as the duration of a callback event,
from the moment we enter the registered callback function to the moment
it returns. There are a few restrictions on actions that may be
performed from callback context:

You can perform long-running actions, but this thread will not be available to service other event deliveries until you return.

You must not cause the current thread to exit.

You must not call either fmev_shdl_unsubscribe() or fmev_shdl_fini() for the subscription handle on which this callback has been made.

You can invoke fork(), popen(), etc.

Event Handles

A callback receives an fmev_t as a handle on the associated event.
The callback may use the access functions described below to retrieve various
event attributes.

By default, an event handle fmev_t is valid for the duration of
the callback context. You cannot access the event outside of callback context.

If you need to continue to work with an event beyond the
initial callback context in which it is received, you may place a
“hold” on the event with fmev_hold(). When finished with the event, release
it with fmev_rele(). These calls increment and decrement a reference count on the
event; when it drops to zero the event is freed. On initial
entry to a callback the reference count is 1, and this is
always decremented when the callback returns.
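
The reference-counting rules above can be modelled in a few lines of portable C. The sketch below is not the library's implementation: model_event_t, model_event_new(), model_hold() and model_rele() are hypothetical stand-ins that only mirror the behaviour described for fmev_hold() and fmev_rele().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's private event structure. */
typedef struct model_event {
	unsigned int refs;	/* outstanding references */
	bool *freed;		/* set to true when the event is freed */
} model_event_t;

/* Create an event as it looks on entry to a callback: one reference. */
model_event_t *
model_event_new(bool *freed)
{
	model_event_t *ev = malloc(sizeof (*ev));

	if (ev == NULL)
		return (NULL);
	ev->refs = 1;
	ev->freed = freed;
	*freed = false;
	return (ev);
}

/* Analogue of fmev_hold(): take an additional reference. */
void
model_hold(model_event_t *ev)
{
	ev->refs++;
}

/* Analogue of fmev_rele(): drop a reference, freeing at zero. */
void
model_rele(model_event_t *ev)
{
	assert(ev->refs > 0);
	if (--ev->refs == 0) {
		*ev->freed = true;
		free(ev);
	}
}
```

In this model a callback that calls model_hold() leaves the count at 2; the release performed when the callback returns brings it back to 1, and the event survives until the application's own model_rele().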

An alternative to fmev_hold() is fmev_dup(), which duplicates the event and returns
a new event handle with a reference count of 1. When fmev_rele()
is applied to the new handle and reduces the reference count to
0, the event is freed. The advantage of fmev_dup() is that it allocates
new memory to hold the event rather than continuing to hold a
buffer provided by the underlying delivery mechanism. If your operation is going
to be long-running, you may want to use fmev_dup() to avoid starving the
underlying mechanism of event buffers.

Given an fmev_t, a callback function can use fmev_ev2shdl() to retrieve the
subscription handle on which the subscription was made that resulted in this
event delivery.

The fmev_hold() and fmev_rele() functions always succeed.

The fmev_dup() function may fail and return NULL with fmev_errno of:

FMEVERR_API

A NULL event handle was passed.

FMEVERR_ALLOC

The fmev_shdl_alloc() call failed.

Event Class

A delivery callback already receives the event class as an argument, so
fmev_class() will only be of use outside of callback context (that is,
for an event that was held or duped in callback context and
is now being processed in an asynchronous handler). This is a convenience function
that returns the same result as accessing the event attributes with fmev_attr_list()
and using nvlist_lookup_string(3NVPAIR) to look up the string member named “class”.

The string returned by fmev_class() is valid for as long as the
event handle itself.

The fmev_class() function may fail and return NULL with fmev_errno of:

FMEVERR_API

A NULL event handle was passed.

FMEVERR_MALFORMED_EVENT

The event appears corrupted.

Event Attribute List

All events are defined as a series of (name, type) pairs. An
instance of an event is therefore a series of tuples (name, type,
value). Allowed types are defined in the protocol specification. In Solaris, and
in libfmevent, an event is represented as an nvlist_t using the libnvpair(3LIB) library.

The nvlist of event attributes can be accessed using fmev_attr_list(). The resulting
nvlist_t pointer is valid for the same duration as the underlying event
handle. Do not use nvlist_free() to free the nvlist. You may
look up members, iterate over members, and so on using the libnvpair interfaces.

The fmev_attr_list() function may fail and return NULL with fmev_errno of:

FMEVERR_API

A NULL event handle was passed.

FMEVERR_MALFORMED_EVENT

The event appears corrupted.

Event Timestamp

These functions refer to the time at which the event was originally
produced, not the time at which it was forwarded to libfmevent or
delivered to the callback.

Use fmev_timespec() to fill a struct timespec with the event time in seconds
since the Epoch (tv_sec, signed integer) and nanoseconds past that second (tv_nsec,
a signed long). This call can fail and return FMEVERR_OVERFLOW if the
seconds value will not fit in a signed 32-bit integer (as used
in struct timespec tv_sec).

You can use fmev_time_sec() and fmev_time_nsec() to retrieve the same second and
nanosecond values as uint64_t quantities.

The fmev_localtime() function takes an event handle and a struct tm pointer and
fills that structure according to the timestamp. The result is suitable for
use with strftime(3C). This call will return NULL and fmev_errno of FMEVERR_OVERFLOW
under the same conditions as above.

FMEVERR_OVERFLOW

The fmev_timespec() function cannot fit the seconds value into the signed long integer tv_sec member of a struct timespec.
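
The overflow condition can be illustrated without the library. The sketch below is a hypothetical helper, fill_timespec(), that takes the seconds and nanoseconds as unsigned 64-bit values (the form in which fmev_time_sec() and fmev_time_nsec() report them) and refuses to fill a struct timespec when the seconds cannot be represented in tv_sec on the build platform. It assumes time_t is a signed integer type; the range check on nanoseconds is a defensive addition, not something this page describes.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: store an event time given as unsigned 64-bit
 * seconds and nanoseconds into a struct timespec. Returns false,
 * leaving *tsp untouched, when the seconds value cannot be represented
 * in tv_sec on this platform (the analogue of FMEVERR_OVERFLOW).
 */
bool
fill_timespec(uint64_t sec, uint64_t nsec, struct timespec *tsp)
{
	time_t t = (time_t)sec;

	/* A lossy conversion shows up as a sign flip or a changed value. */
	if (t < 0 || (uint64_t)t != sec)
		return (false);

	/* Defensive check: nanoseconds must be less than one second. */
	if (nsec >= 1000000000ULL)
		return (false);

	tsp->tv_sec = t;
	tsp->tv_nsec = (long)nsec;
	return (true);
}
```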

String Functions

A string can be duplicated using fmev_shdl_strdup(); this will allocate memory for
the copy using the allocator nominated in fmev_shdl_init(). The caller is responsible
for freeing the buffer using fmev_shdl_strfree(); the caller can modify the duplicated string
but must not change the string length.

An FMRI retrieved from a received event as an nvlist_t may be
rendered as a string using fmev_shdl_nvl2str(). The nvlist must be a legal
FMRI (recognized class, version and payload), or NULL is returned with fmev_errno
of FMEVERR_INVALIDARG. The formatted string is rendered into a buffer allocated
using the memory allocation functions nominated in fmev_shdl_init(), and the caller is
responsible for freeing that buffer using fmev_shdl_strfree().

Memory Allocation

The fmev_shdl_alloc(), fmev_shdl_zalloc(), and fmev_shdl_free() functions allocate and free memory using the
choices made for the given handle when it was initialized, typically the
libumem(3LIB) family if all were specified NULL.

Subscription Handle Control

The fmev_shdlctl_*() interfaces offer control over various properties of the subscription handle,
allowing fine-tuning for particular applications. In the common case the default handle
properties will suffice.

These properties apply to the handle and uniformly to all subscriptions made
on that handle. The properties may only be changed when there are
no subscriptions in place on the handle, otherwise FMEVERR_BUSY is returned.

Event delivery is performed through invocations of a private door. A new
door is opened for each fmev_shdl_subscribe() call. These invocations occur in the
context of a single private thread associated with the door for a
subscription. Many of the fmev_shdlctl_*() interfaces are concerned with controlling various aspects of
this delivery thread.

If you have applied fmev_shdlctl_thrcreate(), “custom thread creation semantics” apply on the
handle; otherwise “default thread creation semantics” are in force. Some fmev_shdlctl_*() interfaces
apply only to default thread creation semantics.

The fmev_shdlctl_serialize() control requests that all deliveries on a handle, regardless of
which subscription request they are for, be serialized, so that no deliveries on
this handle run concurrently. Without this control applied, deliveries arising from each subscription
established with fmev_shdl_subscribe() are individually single-threaded, but if multiple subscriptions have been established
then deliveries arising from separate subscriptions may be concurrent. This control applies
to both custom and default thread creation semantics.

The fmev_shdlctl_thrattr() control applies only to default thread creation semantics. Threads that
are created to service subscriptions will be created with pthread_create(3C) using the
pthread_attr_t provided by this interface. The attribute structure is not copied and
so must persist for as long as it is in force on
the handle.

The default thread attributes are also the minimum requirement: threads must be
created PTHREAD_CREATE_DETACHED and PTHREAD_SCOPE_SYSTEM. A NULL pointer for the pthread_attr_t will reinstate
these default attributes.

The fmev_shdlctl_sigmask() control applies only to default thread creation semantics. Threads that
are created to service subscriptions will be created with the requested signal
set masked: pthread_sigmask(3C) is called with SIG_SETMASK and this mask prior
to pthread_create(). The default is to mask all signals except SIGABRT.

See door_xcreate(3C) for a detailed description of thread setup and creation functions
for door server threads.

The fmev_shdlctl_thrsetup() function runs in the context of the newly-created thread before
it binds to the door created to service the subscription. It is
therefore a suitable place to perform any thread-specific operations the application may
require. This control applies to both custom and default thread creation semantics.

Using fmev_shdlctl_thrcreate() forfeits the default thread creation semantics described above. The function
appointed is responsible for all of the tasks required of a door_xcreate_server_func_t
in door_xcreate().

The fmev_shdlctl_*() functions may fail and return:

FMEVERR_BUSY

Subscriptions are in place on this handle.

Examples

Example 1 Subscription example

The following example subscribes to list.suspect events and prints out a simple
message for each one that is received. It forgoes most error
checking for the sake of clarity.