Sockets are endpoints for communication. Some types of sockets provide reliable communications. Others offer few guarantees but incur little system overhead. Sockets can connect processes on a single machine or across the Internet.

In this chapter we consider the two most commonly used types of sockets: streams and datagrams. Streams provide a bidirectional, sequenced, and reliable channel of communication, similar to pipes. Datagram sockets do not guarantee sequenced, reliable delivery, but they do guarantee that message boundaries will be preserved when read. Your system may support other types of sockets as well; consult your socket(2) manpage or equivalent documentation for details.

We also consider both the Internet and Unix domains. The Internet domain gives sockets two-part names: a host (an IP address in a particular format) and a port number. In the Unix domain, sockets are named using files (e.g., /tmp/mysock).
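To make the file-based naming concrete, here is a minimal sketch that binds a Unix domain socket to a path and checks that the path now refers to a socket file. The temporary directory and the filename mysock are assumptions chosen so that the example cleans up after itself; it presumes a Unix-like system.

```perl
use strict;
use warnings;
use Socket qw(PF_UNIX SOCK_STREAM sockaddr_un);
use File::Temp qw(tempdir);

# Hypothetical location for the socket file; removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/mysock";

# Create a stream socket in the Unix domain and give it a filename.
socket(my $sock, PF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
bind($sock, sockaddr_un($path))           or die "bind: $!";

# The -S file test is true when the path refers to a socket.
my $is_socket = -S $path;
print "$path is ", ($is_socket ? "" : "not "), "a socket file\n";

# In list context, sockaddr_un unpacks a socket name back into a path.
my ($bound_path) = sockaddr_un(getsockname($sock));

close($sock);
unlink($path);
```

The socket file outlives the socket itself, which is why the sketch unlinks it explicitly.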

In addition to domains and types, sockets also have a protocol associated with them. Protocols are not very important to the casual programmer, as there is rarely more than one protocol for a given domain and type of socket.

Domains and types are normally identified by numeric constants (available through functions exported by the Socket and IO::Socket modules). Stream sockets have the type SOCK_STREAM, and datagram sockets have the type SOCK_DGRAM. The Internet domain is PF_INET, and the Unix domain is PF_UNIX. (POSIX uses PF_LOCAL instead of PF_UNIX, but PF_UNIX will almost always be an acceptable constant simply because of the preponderance of existing software that uses it.) You should use these symbolic names instead of numbers because the numbers may change (and historically, have).

Protocols have names like tcp and udp, which correspond to numbers that the operating system uses. The getprotobyname function (built into Perl) returns the number when given a protocol name. Pass protocol number 0 to socket functions to have the system select an appropriate default.
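The following sketch ties the symbolic constants and protocol lookup together. The fallback to the Socket module's IPPROTO_TCP and IPPROTO_UDP constants is our own assumption, added only so the example still runs on systems that lack a protocol database.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOCK_DGRAM IPPROTO_TCP IPPROTO_UDP);

# In scalar context, getprotobyname returns the protocol number.
# Fall back to the Socket constants if the lookup fails (an assumption
# for portability, not something the lookup itself requires).
my $tcp = getprotobyname('tcp') // IPPROTO_TCP;
my $udp = getprotobyname('udp') // IPPROTO_UDP;
print "tcp is protocol $tcp, udp is protocol $udp\n";

# A stream socket in the Internet domain with an explicit protocol ...
socket(my $explicit, PF_INET, SOCK_STREAM, $tcp) or die "socket: $!";

# ... and a datagram socket where 0 lets the system pick the default.
socket(my $default, PF_INET, SOCK_DGRAM, 0) or die "socket: $!";

close($explicit);
close($default);
```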

Perl has built-in functions to create and manipulate sockets; these functions largely mimic their C counterparts. While this is good for providing low-level, direct access to every part of the system, most of us prefer something more convenient. That's what the IO::Socket::INET and IO::Socket::UNIX classes are for: they provide a high-level interface to otherwise intricate system calls.
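As a rough illustration of the high-level interface, the sketch below creates a listening socket and a client in the same process and passes one line between them. The loopback address and the use of port 0 (which asks the system for any free port) are assumptions made to keep the example self-contained; a real server and client would normally be separate programs.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listening socket on the loopback interface; port 0 means "any free port".
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 5,
    ReuseAddr => 1,
) or die "can't create listening socket: $@";

my $port = $server->sockport;    # the port the system actually assigned

# Client socket connected to that port.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "can't connect to port $port: $@";

# The connection is already queued, so accept returns immediately.
my $conn = $server->accept or die "accept: $!";

print $client "hello\n";         # IO::Socket handles autoflush by default
my $line = <$conn>;
print "server received: $line";

close($_) for $client, $conn, $server;
```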

Let's look at the built-in functions first. They all return undef and set $! if an error occurs. The socket function makes a socket, bind gives a socket a local name, connect connects a local socket to a (possibly remote) one, listen readies a socket for connections from other sockets, and accept receives the connections one by one. You can communicate over a stream socket with print and <> as well as with syswrite and sysread, or over a datagram socket with send and recv. (Perl does not currently support sendmsg(2).)
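Here is a minimal sketch of the built-in functions working together on a stream socket. Both ends live in one process purely for demonstration, and the loopback address and system-chosen port are assumptions. It uses print and <> in one direction and syswrite and sysread in the other.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOMAXCONN INADDR_LOOPBACK sockaddr_in);
use IO::Handle;    # provides the autoflush method on older perls

# Server side: make a socket, name it, and ready it for connections.
socket(my $server, PF_INET, SOCK_STREAM, 0)     or die "socket: $!";
bind($server, sockaddr_in(0, INADDR_LOOPBACK))  or die "bind: $!";
listen($server, SOMAXCONN)                      or die "listen: $!";

# Ask the system which port it assigned (port 0 meant "any free port").
my ($port) = sockaddr_in(getsockname($server));

# Client side: make a socket and connect it to the server's name.
socket(my $client, PF_INET, SOCK_STREAM, 0)            or die "socket: $!";
connect($client, sockaddr_in($port, INADDR_LOOPBACK))  or die "connect: $!";

# Receive the queued connection on the server side.
accept(my $conn, $server) or die "accept: $!";

# Buffered I/O from client to server.
$client->autoflush(1);
print $client "ping\n";
my $line = <$conn>;

# Unbuffered I/O from server back to client.
defined syswrite($conn, "pong\n")        or die "syswrite: $!";
defined sysread($client, my $reply, 1024) or die "sysread: $!";

print "server read: $line";
print "client read: $reply";

close($_) for $client, $conn, $server;
```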

A typical server calls socket, bind, and listen, then loops in a blocking accept call that waits for incoming connections (see Recipe 17.2 and Recipe 17.5). A typical client calls socket and connect (see Recipe 17.1 and Recipe 17.4). Datagram clients are special. They don't have to connect to send data because they can specify the destination as an argument to send.
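The datagram case can be sketched as follows: the sender never calls bind or connect, and each recv returns one whole message. The loopback address, the system-chosen port, and the message text are assumptions; delivery over the loopback interface is reliable in practice, but datagram sockets in general make no such promise.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_DGRAM INADDR_LOOPBACK sockaddr_in);

# Receiver: a datagram socket bound to a system-chosen loopback port.
socket(my $receiver, PF_INET, SOCK_DGRAM, 0)      or die "socket: $!";
bind($receiver, sockaddr_in(0, INADDR_LOOPBACK))  or die "bind: $!";
my ($port) = sockaddr_in(getsockname($receiver));

# Sender: no bind and no connect; the destination is an argument to send.
socket(my $sender, PF_INET, SOCK_DGRAM, 0) or die "socket: $!";
my $dest = sockaddr_in($port, INADDR_LOOPBACK);
defined send($sender, "first",  0, $dest) or die "send: $!";
defined send($sender, "second", 0, $dest) or die "send: $!";

# Each recv returns exactly one datagram, so message boundaries survive.
my @messages;
for (1 .. 2) {
    defined recv($receiver, my $buf, 1024, 0) or die "recv: $!";
    push @messages, $buf;
}
print "got: @messages\n";

close($_) for $sender, $receiver;
```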

When you bind, connect, or send to a specific destination, you must supply a socket name. An Internet domain socket name is a host (an IP address packed with inet_aton) and a port (a number), packed into a C-style structure with sockaddr_in:
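A minimal sketch of that packing, and of the reverse operation, might look like this. The loopback address and port 8080 are illustrative assumptions, not values taken from the text.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa sockaddr_in);

# Pack the host address, then pack host and port into a socket name.
my $packed_ip   = inet_aton("127.0.0.1") or die "can't pack address";
my $socket_name = sockaddr_in(8080, $packed_ip);   # scalar context packs

# In list context, sockaddr_in unpacks a socket name again.
my ($port, $ip) = sockaddr_in($socket_name);
print "port $port on host ", inet_ntoa($ip), "\n";
```

The resulting $socket_name is the value you would pass to bind, connect, or send.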

Most recipes use Internet domain sockets in their examples, but nearly everything that applies to the Internet domain also applies to the Unix domain. Recipe 17.6 explains the differences and pitfalls.

Sockets are the basis of network services. We provide three ways to write servers: one where a child process is created for each incoming connection (Recipe 17.11), one where the server forks in advance (Recipe 17.12), and one where the server process doesn't fork at all (Recipe 17.13).

Some servers need to listen to many IP addresses at once, which we demonstrate in Recipe 17.14. Well-behaved servers clean up and restart when they get a HUP signal; Recipe 17.16 shows how to implement that behavior in Perl. We also show how to put a name to both ends of a connection; see Recipe 17.7 and Recipe 17.8.

Unix Network Programming and the three-volume TCP/IP Illustrated by W. Richard Stevens are indispensable for the serious socket programmer. If you want to learn the basics about sockets, it's hard to beat the original and classic reference, An Advanced 4.4BSD Interprocess Communication Tutorial. It's written for C, but almost everything is directly applicable to Perl. It's available in /usr/share/doc on most BSD-derived Unix systems. We also recommend you look at The Unix Programming Frequently Asked Questions List (Gierth and Horgan), and Programming UNIX Sockets in C - Frequently Asked Questions (Metcalf and Gierth), both of which are posted periodically to the comp.unix.answers newsgroup.