Sockets handle two completely different types of I/O, each with attendant pitfalls and benefits. The normal Perl I/O functions used on files (except for seek and sysseek) work for stream sockets, but datagram sockets require the system calls send and recv, which work on complete records.
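To make the "complete records" point concrete, here is a minimal sketch using send and recv on a pair of datagram sockets. The socketpair call and the AF_UNIX domain are assumptions chosen only to keep the example self-contained; a real program would more likely use UDP sockets, and the variable names are illustrative.

```perl
use strict;
use warnings;
use Socket;

# A connected pair of datagram sockets; AF_UNIX keeps the example off the network.
socketpair(my $tx, my $rx, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
    or die "socketpair: $!";

# Each send() transmits one complete record.
for my $msg ("first record", "second") {
    defined send($tx, $msg, 0) or die "send: $!";
}

# Each recv() returns exactly one record, even though the
# 1024-byte limit would be large enough to hold both.
my @records;
for (1 .. 2) {
    my $buf = '';
    defined recv($rx, $buf, 1024, 0) or die "recv: $!";
    push @records, $buf;
}
print "$_\n" for @records;
```

On a stream socket the same two writes could arrive merged into a single read, which is why stream protocols need their own record separators.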

Awareness of buffering issues is particularly important in socket programming. That's because buffering, while designed to enhance performance, can interfere with the interactive feel that some programs require. Gathering input with <> may try to read more data from the socket than is yet available as it looks for a record separator. Both print and <> use stdio buffers, so unless you've changed autoflushing (see the Introduction to Chapter 7, File Access) on the socket handle, your data won't be sent to the other end as soon as you print it. Instead, it will wait until a buffer fills up.

For line-based clients and servers, this is probably okay, so long as you turn on autoflushing for output. Newer versions of IO::Socket do this automatically on the anonymous filehandles returned by IO::Socket->new.
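For a handle that was not created by IO::Socket, autoflushing can be turned on by hand. The sketch below assumes a socketpair as a stand-in for a real client/server connection and uses the autoflush method from IO::Handle; the handle names are illustrative.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;    # provides the autoflush method on filehandles

# A connected pair of stream sockets stands in for a client/server link.
socketpair(my $near, my $far, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Flush after every print instead of waiting for the buffer to fill.
$near->autoflush(1);

print $near "hello\n";    # handed to the operating system immediately
my $line = <$far>;        # returns once the newline-terminated line arrives
print "got: $line";
```

Without the autoflush call, the line would sit in the buffer for $near and the read on $far would block.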

But stdio isn't the only source of buffering. Output (print, printf, or syswrite, or send on a TCP socket) is further subject to buffering at the operating system level under a strategy called the Nagle Algorithm. When a packet of data has been sent but not acknowledged, further to-be-sent data is queued and is sent as soon as another complete packet's worth is collected or the outstanding acknowledgment is received. In some situations (mouse events being sent to a windowing system, keystrokes to a real-time application) this buffering is inconvenient or downright wrong. You can disable the Nagle Algorithm with the TCP_NODELAY socket option.
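A minimal sketch of setting the option follows. It assumes a Socket module recent enough to export IPPROTO_TCP and TCP_NODELAY directly, which is one way to obtain the constants on current Perls; the loopback listener on an operating-system-chosen port exists only so that the example has a TCP connection to work with.

```perl
use strict;
use warnings;
use Socket qw(IPPROTO_TCP TCP_NODELAY);
use IO::Socket::INET;

# Loopback listener on an OS-chosen port, just to have something to connect to.
my $server = IO::Socket::INET->new(
    Listen    => 1,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
) or die "listen: $@";

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $server->sockport,
    Proto    => 'tcp',
) or die "connect: $@";

# Disable the Nagle Algorithm on the client socket.
setsockopt($client, IPPROTO_TCP, TCP_NODELAY, 1)
    or die "setsockopt: $!";

# Read the option back to confirm that it took effect.
my $packed = getsockopt($client, IPPROTO_TCP, TCP_NODELAY)
    or die "getsockopt: $!";
my $nodelay = unpack("i", $packed);
print "TCP_NODELAY is ", ($nodelay ? "on" : "off"), "\n";
```

Passing 0 instead of 1 in the same setsockopt call turns the buffering back on.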

In most cases, TCP_NODELAY isn't something you need. TCP buffering is there for a reason, so don't disable it unless your application is one of the few real-time, packet-intensive situations that need it disabled.

Load in TCP_NODELAY from sys/socket.ph, a file that isn't automatically installed with Perl, but can be easily built. See Recipe 12.14 for details.

Because buffering is such an issue, you have the select function to determine which filehandles have unread input, which can be written to, and which have "exceptional conditions" pending. The select function takes three strings interpreted as binary data, each bit corresponding to a filehandle. A typical call to select looks like this:
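The sketch below shows one such call. The names $rin and $rout match those used in the surrounding text; the socketpair, the "ping" message, and the other variable names are assumptions added only so that the example runs on its own.

```perl
use strict;
use warnings;
use Socket;

# Stand-in for a connected socket: write on $peer so $sock has unread input.
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
defined syswrite($peer, "ping\n") or die "syswrite: $!";

my $rin = '';                        # initialize the bitmask
vec($rin, fileno($sock), 1) = 1;     # mark $sock in $rin
# repeat the vec() call for each socket to check

my $timeout = 10;                    # wait at most ten seconds

my $rout;
my $nfound = select($rout = $rin, undef, undef, $timeout);

my $buf = '';
if ($nfound > 0 && vec($rout, fileno($sock), 1)) {
    # data is waiting on $sock; sysread avoids mixing stdio buffering with select
    sysread($sock, $buf, 1024);
    print "read: $buf";
}
```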

The four arguments to select are: a bitmask indicating which filehandles to check for unread data; a bitmask indicating which filehandles to check for safety to write without blocking; a bitmask indicating which filehandles to check for exceptional conditions on; and a time in seconds indicating the maximum time to wait (this can be a floating point number).

The function changes the bitmask arguments passed to it, so that when it returns, the only bits set correspond to filehandles ready for I/O. This leads to the common strategy of assigning an input mask ($rin above) to an output one ($rout above), so that select can only affect $rout, leaving $rin alone.

You can specify a timeout of 0 to poll (check without blocking). Some beginning programmers think that blocking is bad, so they write programs that "busy wait": they poll and poll and poll and poll. When a program blocks, the operating system recognizes that the process is pending on input and gives CPU time to other programs until input is available. When a program busy-waits, the system can't let it sleep because it's always doing something: checking for input! Occasionally, polling is the right thing to do, but far more often it's not. A timeout of undef to select means "no timeout," and your program will patiently block until input becomes available.

Because select uses bitmasks, which are tiresome to create and difficult to interpret, we use the standard IO::Select module in the Solution section. It bypasses bitmasks and is, generally, the easier route.
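For comparison with the bitmask interface, here is a minimal IO::Select sketch. It is not the Solution code itself; the socketpair and the "ping" message are assumptions used to give the handle something to read.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# Stand-in for a connected socket: write on $peer so $sock has unread input.
socketpair(my $sock, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
defined syswrite($peer, "ping\n") or die "syswrite: $!";

my $select = IO::Select->new($sock);    # more handles can be added with add()

# can_read returns the handles with unread input, waiting up to 10 seconds.
my @ready = $select->can_read(10);

my $buf = '';
for my $fh (@ready) {
    sysread($fh, $buf, 1024);
    print "read: $buf";
}
```

The module builds and decodes the bitmasks internally, so the caller deals only in filehandles.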

A full explanation of the exceptional data tested for with the third bitmask in select is beyond the scope of this book. Consult Stevens's Unix Network Programming for a discussion of out-of-band and urgent data.

Other send and recv flags are listed in the manpages for those system calls.