A socket is an endpoint for communication using the facilities described in this section. A socket is created with a specific
socket type, described in Socket Types, and is associated with a specific protocol, detailed in Protocols. A socket is accessed via a file descriptor obtained when the socket is created.

All network protocols are associated with a specific address family. An address family provides basic services to the protocol
implementation to allow it to function within a specific network environment. These services may include packet fragmentation and
reassembly, routing, addressing, and basic transport. An address family is normally comprised of a number of protocols, one per
socket type. Each protocol is characterized by an abstract socket type. It is not required that an address family support all
socket types. An address family may contain multiple protocols supporting the same socket abstraction.

An address family defines the format of a socket address. All network addresses are described using a general structure, called
a sockaddr, as defined in the Base Definitions volume of IEEE Std 1003.1-2001, <sys/socket.h>. However, each address family imposes finer and more specific
structure, generally defining a structure with fields specific to the address family. The field sa_family in the
sockaddr structure contains the address family identifier, specifying the format of the sa_data area. The size of the
sa_data area is unspecified.

A protocol supports one of the socket abstractions detailed in Socket Types. Selecting a protocol
involves specifying the address family, socket type, and protocol number to the socket() function. Certain semantics of the basic socket abstractions are protocol-specific.
All protocols are expected to support the basic model for their particular socket type, but may, in addition, provide non-standard
facilities or extensions to a mechanism.
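
As an illustration of selecting a protocol, the following minimal sketch passes an address family, socket type, and protocol number to socket(). The helper name open_socket() is hypothetical and is not defined by this volume.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: create a socket for the given address family,
 * socket type, and protocol number. A protocol of 0 asks for the
 * default protocol of that family and type.
 * Returns the file descriptor, or -1 with errno set by socket(). */
int open_socket(int family, int type, int protocol)
{
    int fd = socket(family, type, protocol);

    if (fd == -1)
        perror("socket");
    return fd;
}
```

A caller might, for example, request a stream socket in the IPv4 family with open_socket(AF_INET, SOCK_STREAM, 0) and release it later with close().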

Each network interface in a system corresponds to a path through which messages can be sent and received. A network interface
usually has a hardware device associated with it, though certain interfaces, such as the loopback interface, do not.

A socket is created with a specific type, which defines the communication semantics and which allows the selection of an
appropriate communication protocol. Four types are defined: [RS] SOCK_RAW, SOCK_STREAM, SOCK_SEQPACKET, and SOCK_DGRAM. Implementations may specify additional socket types.

The SOCK_STREAM socket type provides reliable, sequenced, full-duplex octet streams between the socket and a peer to which the
socket is connected. A socket of type SOCK_STREAM must be in a connected state before any data may be sent or received. Record
boundaries are not maintained; data sent on a stream socket using output operations of one size may be received using input
operations of smaller or larger sizes without loss of data. Data may be buffered; successful return from an output function does
not imply that the data has been delivered to the peer or even transmitted from the local system. If data cannot be successfully
transmitted within a given time then the connection is considered broken, and subsequent operations shall fail. A SIGPIPE signal is
raised if a thread sends on a broken stream (one that is no longer connected). Support for an out-of-band data transmission
facility is protocol-specific.
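
Because record boundaries are not maintained on a stream, an application that expects a fixed number of octets typically loops until they have all arrived. The following is a minimal sketch of such a loop; the helper name recv_all() is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read exactly len octets from a connected
 * SOCK_STREAM socket. One recv() may return fewer octets than
 * requested, so the call is repeated. Returns the number of octets
 * read (less than len only if the peer closed the stream), or -1. */
ssize_t recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);

        if (n == 0)
            break;              /* end-of-file: peer closed */
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

The same approach applies to output: a send() on a stream may accept only part of the data, so a sender that needs all of it queued would loop in a similar way.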

The SOCK_SEQPACKET socket type is similar to the SOCK_STREAM type, and is also connection-oriented. The only difference between
these types is that record boundaries are maintained using the SOCK_SEQPACKET type. A record can be sent using one or more output
operations and received using one or more input operations, but a single operation never transfers parts of more than one record.
Record boundaries are visible to the receiver via the MSG_EOR flag in the received message flags returned by the recvmsg() function. It is protocol-specific whether a maximum record size is imposed.

The SOCK_DGRAM socket type supports connectionless data transfer which is not necessarily acknowledged or reliable. Datagrams
may be sent to the address specified (possibly multicast or broadcast) in each output operation, and incoming datagrams may be
received from multiple sources. The source address of each datagram is available when receiving the datagram. An application may
also pre-specify a peer address, in which case calls to output functions shall send to the pre-specified peer. If a peer has been
specified, only datagrams from that peer shall be received. A datagram must be sent in a single output operation, and must be
received in a single input operation. The maximum size of a datagram is protocol-specific; with some protocols, the limit is
implementation-defined. Output datagrams may be buffered within the system; thus, a successful return from an output function does
not guarantee that a datagram is actually sent or received. However, implementations should attempt to detect any errors possible
before the return of an output function, reporting any error by an unsuccessful return value.
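
The sketch below shows one way an application might receive a single datagram together with its source address. The helper name recv_with_source() is hypothetical; it is a thin wrapper around recvfrom().

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: receive one datagram and its source address.
 * Each call consumes at most one datagram, even if buf could hold
 * several. On return *srclen holds the length of the stored address.
 * Returns the number of octets received, or -1 on error. */
ssize_t recv_with_source(int fd, void *buf, size_t len,
                         struct sockaddr_storage *src, socklen_t *srclen)
{
    *srclen = sizeof *src;
    return recvfrom(fd, buf, len, 0, (struct sockaddr *)src, srclen);
}
```

If two datagrams of 3 and 5 octets are queued, two calls with a large buffer return 3 and then 5; the datagrams are not merged.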

[RS] The SOCK_RAW socket type is similar to the SOCK_DGRAM type. It differs in that it is normally used with communication providers that
underlie those used for the other socket types. For this reason, the creation of a socket with type SOCK_RAW shall require
appropriate privilege. The format of datagrams sent and received with this socket type generally includes specific protocol headers,
and the formats are protocol-specific and implementation-defined.

The I/O mode of a socket is described by the O_NONBLOCK file status flag which pertains to the open file description for the
socket. This flag is initially off when a socket is created, but may be set and cleared by the use of the F_SETFL command of the fcntl() function.

When the O_NONBLOCK flag is set, functions that would normally block until they are complete shall either return immediately
with an error, or shall complete asynchronously to the execution of the calling process. Data transfer operations (the read(), write(), send(), and recv() functions) shall complete
immediately, transfer only as much as is available, and then return without blocking, or return an error indicating that no
transfer could be made without blocking. The connect() function initiates a
connection and shall return without blocking when O_NONBLOCK is set; it shall return the error [EINPROGRESS] to indicate that the
connection was initiated successfully, but that it has not yet completed.

The transmit and receive queue sizes for a socket are set when the socket is created. The default sizes used are both
protocol-specific and implementation-defined. The sizes may be changed using the setsockopt() function.

Errors may occur asynchronously, and be reported to the socket in response to input from the network protocol. The socket stores
the pending error to be reported to a user of the socket at the next opportunity. The error is returned in response to a subsequent
send(), recv(), or getsockopt() operation on the socket, and the pending error is then cleared.

A socket has a receive queue that buffers data when it is received by the system until it is removed by a receive call.
Depending on the type of the socket and the communication provider, the receive queue may also contain ancillary data such as the
addressing and other protocol data associated with the normal data in the queue, and may contain out-of-band or expedited data. The
limit on the queue size includes any normal, out-of-band data, datagram source addresses, and ancillary data in the queue. The
description in this section applies to all sockets, even though some elements cannot be present in some instances.

The contents of a receive buffer are logically structured as a series of data segments with associated ancillary data and other
information. A data segment may contain normal data or out-of-band data, but never both. A data segment may complete a record if
the protocol supports records (always true for types SOCK_SEQPACKET and SOCK_DGRAM). A record may be stored as more than one
segment; the complete record might never be present in the receive buffer at one time, as a portion might already have been
returned to the application, and another portion might not yet have been received from the communications provider. A data segment
may contain ancillary protocol data, which is logically associated with the segment. Ancillary data is received as if it were
queued along with the first normal data octet in the segment (if any). A segment may contain ancillary data only, with no normal or
out-of-band data. For the purposes of this section, a datagram is considered to be a data segment that terminates a record, and
that includes a source address as a special type of ancillary data. Data segments are placed into the queue as data is delivered to
the socket by the protocol. Normal data segments are placed at the end of the queue as they are delivered. If a new segment
contains the same type of data as the preceding segment and includes no ancillary data, and if the preceding segment does not
terminate a record, the segments are logically merged into a single segment.

The receive queue is logically terminated if an end-of-file indication has been received or a connection has been terminated. A
segment shall be considered to be terminated if another segment follows it in the queue, if the segment completes a record, or if
an end-of-file or other connection termination has been reported. The last segment in the receive queue shall also be considered to
be terminated while the socket has a pending error to be reported.

A receive operation shall never return data or ancillary data from more than one segment.

The handling of received out-of-band data is protocol-specific. Out-of-band data may be placed in the socket receive queue,
either at the end of the queue or before all normal data in the queue. In this case, out-of-band data is returned to an application
program by a normal receive call. Out-of-band data may also be queued separately rather than being placed in the socket receive
queue, in which case it shall be returned only in response to a receive call that requests out-of-band data. It is
protocol-specific whether an out-of-band data mark is placed in the receive queue to demarcate data preceding the out-of-band data
and following the out-of-band data. An out-of-band data mark is logically an empty data segment that cannot be merged with other
segments in the queue. An out-of-band data mark is never returned in response to an input operation. The sockatmark() function can be used to test whether an out-of-band data mark is the first
element in the queue. If an out-of-band data mark is the first element in the queue when an input function is called without the
MSG_PEEK option, the mark is removed from the queue and the following data (if any) is processed as if the mark had not been
present.

Sockets that are used to accept incoming connections maintain a queue of outstanding connection indications. This queue is a
list of connections that are awaiting acceptance by the application; see listen().

One category of event at the socket interface is the generation of signals. These signals report protocol events or process
errors relating to the state of the socket. The generation or delivery of a signal does not change the state of the socket,
although the generation of the signal may have been caused by a state change.

The SIGPIPE signal shall be sent to a thread that attempts to send data on a socket that is no longer able to send. In addition,
the send operation fails with the error [EPIPE].

If any of the following conditions occur asynchronously for a socket, the corresponding value listed below shall become the
pending error for the socket:

[ECONNABORTED]

The connection was aborted locally.

[ECONNREFUSED]

For a connection-mode socket attempting a non-blocking connection, the attempt to connect was forcefully rejected. For a
connectionless-mode socket, an attempt to deliver a datagram was forcefully rejected.

[ECONNRESET]

The peer has aborted the connection.

[EHOSTDOWN]

The destination host has been determined to be down or disconnected.

[EHOSTUNREACH]

The destination host is not reachable.

[EMSGSIZE]

For a connectionless-mode socket, the size of a previously sent datagram prevented delivery.

There are a number of socket options which either specialize the behavior of a socket or provide useful information. These
options may be set at different protocol levels and are always present at the uppermost "socket" level.

Socket options are manipulated by two functions, getsockopt() and setsockopt(). These functions allow an application program to customize the behavior and
characteristics of a socket to provide the desired effect.
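
As a sketch of how these two functions are typically called for integer-valued options at the socket level, consider the following pair of helpers. The names set_socket_flag() and get_socket_int() are hypothetical.

```c
#include <sys/socket.h>

/* Hypothetical helper: turn a Boolean socket-level option on or off.
 * Returns 0 on success, or -1 with errno set by setsockopt(). */
int set_socket_flag(int fd, int option, int on)
{
    return setsockopt(fd, SOL_SOCKET, option, &on, sizeof on);
}

/* Hypothetical helper: read an int-valued socket-level option into
 * *value. Returns 0 on success, or -1 with errno set by getsockopt(). */
int get_socket_int(int fd, int option, int *value)
{
    socklen_t len = sizeof *value;

    return getsockopt(fd, SOL_SOCKET, option, value, &len);
}
```

For example, set_socket_flag(fd, SO_REUSEADDR, 1) requests address reuse, and a later get_socket_int(fd, SO_REUSEADDR, &v) stores a non-zero value in v.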

All of the options have default values. The type and meaning of these values are defined by the protocol level to which they
apply. Instead of using the default values, an application program may choose to customize one or more of the options. However, in
the bulk of cases, the default values are sufficient for the application.

Some of the options are used to enable or disable certain behavior within the protocol modules (for example, turn on debugging)
while others may be used to set protocol-specific information (for example, IP time-to-live on all the application's outgoing
packets). As each of the options is introduced, its effect on the underlying protocol modules is described.

Socket-Level Options lists those options present at the socket level; that is, when the level
parameter of the getsockopt() or setsockopt() function is SOL_SOCKET, the types of the option value parameters associated
with each option, and a brief synopsis of the meaning of the option value parameter. Unless otherwise noted, each may be examined
with getsockopt() and set with setsockopt() on all types of socket.

The SO_BROADCAST option requests permission to send broadcast datagrams on the socket. Support for SO_BROADCAST is
protocol-specific. The default for SO_BROADCAST is that the ability to send broadcast datagrams on a socket is disabled.

The SO_DEBUG option enables debugging in the underlying protocol modules. This can be useful for tracing the behavior of the
underlying protocol modules during normal system operation. The semantics of the debug reports are implementation-defined. The
default value for SO_DEBUG is for debugging to be turned off.

The SO_DONTROUTE option requests that outgoing messages bypass the standard routing facilities. The destination must be on a
directly-connected network, and messages are directed to the appropriate network interface according to the destination address. It
is protocol-specific whether this option has any effect and how the outgoing network interface is chosen. Support for this option
with each protocol is implementation-defined.

The SO_ERROR option is used only on getsockopt(). When this option is
specified, getsockopt() shall return any pending error on the socket and clear
the error status. It shall return a value of 0 if there is no pending error. SO_ERROR may be used to check for asynchronous errors
on connected connectionless-mode sockets or for other types of asynchronous errors. SO_ERROR has no default value.
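
A minimal sketch of fetching the pending error follows; the helper name take_pending_error() is hypothetical. Note that the call clears the error, so the value should be acted on immediately.

```c
#include <sys/socket.h>

/* Hypothetical helper: fetch and clear the pending error on a socket.
 * Returns the pending error number (0 if there is none), or -1 if
 * getsockopt() itself fails, in which case errno describes why. */
int take_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof err;

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    return err;
}
```

One common use is after a non-blocking connect() has returned [EINPROGRESS]: once the socket is reported writable, a return value of 0 from this helper suggests that the connection completed, while a non-zero value is the reason it failed.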

The SO_KEEPALIVE option enables the periodic transmission of messages on a connected socket. The behavior of this option is
protocol-specific. The default value for SO_KEEPALIVE is zero, specifying that this capability is turned off.

The SO_LINGER option controls the action of the interface when unsent messages are queued on a socket and a close() is performed. The details of this option are protocol-specific. The default value for
SO_LINGER is zero, or off, for the l_onoff element of the option value and zero seconds for the linger time specified by the
l_linger element.
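
Unlike most socket-level options, SO_LINGER takes a linger structure rather than an int. The following sketch shows one way to fill it in; the helper name set_linger() and its convention of treating a negative argument as "off" are assumptions made for this example.

```c
#include <sys/socket.h>

/* Hypothetical helper: enable lingering on close() for the given
 * number of seconds, or disable it when seconds is negative.
 * Returns 0 on success, or -1 with errno set by setsockopt(). */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    lg.l_onoff = (seconds >= 0);            /* non-zero turns it on */
    lg.l_linger = (seconds >= 0) ? seconds : 0;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

Since the details of the option are protocol-specific, the effect of a given linger time on close() should be checked against the protocol in use.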

The SO_OOBINLINE option is valid only on protocols that support out-of-band data. The SO_OOBINLINE option requests that
out-of-band data be placed in the normal data input queue as received; it is then accessible using the read() or recv() functions without the MSG_OOB flag
set. The default for SO_OOBINLINE is off; that is, for out-of-band data not to be placed in the normal data input queue.

The SO_RCVBUF option requests that the buffer space allocated for receive operations on this socket be set to the value, in
bytes, of the option value. Applications may wish to increase buffer size for high volume connections, or may decrease buffer size
to limit the possible backlog of incoming data. The default value for the SO_RCVBUF option value is implementation-defined, and may
vary by protocol.

The maximum value for the option for a socket may be obtained by the use of the fpathconf() function, using the value _PC_SOCK_MAXBUF.

The SO_RCVLOWAT option sets the minimum number of bytes to process for socket input operations. In general, receive calls block
until any (non-zero) amount of data is received, then return the smaller of the amount available or the amount requested. The
default value for SO_RCVLOWAT is 1, and does not affect the general case. If SO_RCVLOWAT is set to a larger value, blocking receive
calls normally wait until they have received the smaller of the low water mark value or the requested amount. Receive calls may
still return less than the low water mark if an error occurs, a signal is caught, or the type of data next in the receive queue is
different from that returned (for example, out-of-band data).
It is implementation-defined whether the SO_RCVLOWAT option can be set.

The SO_RCVTIMEO option is an option to set a timeout value for input operations. It accepts a timeval structure with the
number of seconds and microseconds specifying the limit on how long to wait for an input operation to complete. If a receive
operation has blocked for this much time without receiving additional data, it shall return with a partial count or errno
shall be set to [EWOULDBLOCK] if no data were received. The default for this option is the value zero, which indicates that a
receive operation will not time out. It is implementation-defined whether the SO_RCVTIMEO option can be set.

The SO_REUSEADDR option indicates that the rules used in validating addresses supplied in a bind() should allow reuse of local addresses. Operation of this option is protocol-specific.
The default value for SO_REUSEADDR is off; that is, reuse of local addresses is not permitted.

The SO_SNDBUF option requests that the buffer space allocated for send operations on this socket be set to the value, in bytes,
of the option value. The default value for the SO_SNDBUF option value is implementation-defined, and may vary by protocol. The
maximum value for the option for a socket may be obtained by the use of the fpathconf() function, using the value _PC_SOCK_MAXBUF.

The SO_SNDLOWAT option sets the minimum number of bytes to process for socket output operations. Most output operations process
all of the data supplied by the call, delivering data to the protocol for transmission and blocking as necessary for flow control.
Non-blocking output operations process as much data as permitted subject to flow control without blocking, but process no data if
flow control does not allow the smaller of the send low water mark value or the entire request to be processed. A select() operation testing the ability to write to a socket shall return true only if the
send low water mark could be processed. The default value for SO_SNDLOWAT is implementation-defined and protocol-specific. It is
implementation-defined whether the SO_SNDLOWAT option can be set.

The SO_SNDTIMEO option is an option to set a timeout value for the amount of time that an output function shall block because
flow control prevents data from being sent. As noted in Socket-Level Options, the option value is a
timeval structure with the number of seconds and microseconds specifying the limit on how long to wait for an output
operation to complete. If a send operation has blocked for this much time, it shall return with a partial count or errno set
to [EWOULDBLOCK] if no data were sent. The default for this option is the value zero, which indicates that a send operation will
not time out. It is implementation-defined whether the SO_SNDTIMEO option can be set.

The SO_TYPE option is used only on getsockopt(). When this option is
specified, getsockopt() shall return the type of the socket (for example,
SOCK_STREAM). This option is useful to servers that inherit sockets on start-up. SO_TYPE has no default value.
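
A server that inherits a socket might query its type along the following lines. The helper name socket_type() is hypothetical.

```c
#include <sys/socket.h>

/* Hypothetical helper: return the type of a socket (for example,
 * SOCK_STREAM or SOCK_DGRAM), or -1 if getsockopt() fails. */
int socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}
```

The result can be compared directly with the SOCK_* constants to decide, for instance, whether accept() or recvfrom() is the appropriate next call.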

UNIX domain sockets provide process-to-process communication in a single system.

Headers

The symbolic constant AF_UNIX defined in the <sys/socket.h> header is
used to identify the UNIX domain address family. The <sys/un.h> header
contains other definitions used in connection with UNIX domain sockets. See the Base Definitions volume of
IEEE Std 1003.1-2001, Chapter 13, Headers.

The sockaddr_storage structure defined in <sys/socket.h> shall
be large enough to accommodate a sockaddr_un structure (see the <sys/un.h> header defined in the Base Definitions volume of
IEEE Std 1003.1-2001, Chapter 13, Headers) and shall be aligned at an
appropriate boundary so that pointers to it can be cast as pointers to sockaddr_un structures and used to access the fields
of those structures without alignment problems. When a sockaddr_storage structure is cast as a sockaddr_un structure,
the ss_family field maps onto the sun_family field.
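
The following sketch illustrates the cast described above by filling a sockaddr_storage object through a sockaddr_un pointer. The helper name fill_unix_address() is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: store a UNIX domain address for path in *ss.
 * Returns 0 on success, or -1 if path does not fit in sun_path
 * together with its terminating null byte. */
int fill_unix_address(struct sockaddr_storage *ss, const char *path)
{
    struct sockaddr_un *un = (struct sockaddr_un *)ss;

    if (strlen(path) >= sizeof un->sun_path)
        return -1;
    memset(ss, 0, sizeof *ss);
    un->sun_family = AF_UNIX;       /* overlays ss->ss_family */
    strcpy(un->sun_path, path);
    return 0;
}
```

After a successful call, ss_family reads back as AF_UNIX, and the object can be passed to bind() or connect() cast to a sockaddr pointer with a length of sizeof(struct sockaddr_un).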

[RS] A raw interface to IP is available by creating an Internet socket of type SOCK_RAW. The default protocol for type SOCK_RAW shall be
identified in the IP header with the value IPPROTO_RAW. Applications should not use the default protocol when creating a socket
with type SOCK_RAW, but should identify a specific protocol by value. The ICMP control protocol is accessible from a raw socket by
specifying a value of IPPROTO_ICMP for protocol.

Support for sockets over Internet protocols based on IPv4 is mandatory.

Headers

The symbolic constant AF_INET defined in the <sys/socket.h> header is
used to identify the IPv4 Internet address family. The <netinet/in.h>
header contains other definitions used in connection with IPv4 Internet sockets. See the Base Definitions volume of
IEEE Std 1003.1-2001, Chapter 13, Headers.

The sockaddr_storage structure defined in <sys/socket.h> shall
be large enough to accommodate a sockaddr_in structure (see the <netinet/in.h> header defined in the Base Definitions volume of
IEEE Std 1003.1-2001, Chapter 13, Headers) and shall be aligned at an
appropriate boundary so that pointers to it can be cast as pointers to sockaddr_in structures and used to access the fields
of those structures without alignment problems. When a sockaddr_storage structure is cast as a sockaddr_in structure,
the ss_family field maps onto the sin_family field.
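
As a sketch of the same cast for IPv4, the helper below stores the loopback address and a port number in a sockaddr_storage object. The helper name fill_ipv4_loopback() is hypothetical; the address and port are converted to network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: store the IPv4 loopback address and the given
 * port (in host byte order) in *ss as a sockaddr_in. */
void fill_ipv4_loopback(struct sockaddr_storage *ss, unsigned short port)
{
    struct sockaddr_in *in4 = (struct sockaddr_in *)ss;

    memset(ss, 0, sizeof *ss);
    in4->sin_family = AF_INET;                  /* overlays ss_family */
    in4->sin_port = htons(port);                /* network byte order */
    in4->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
}
```

The filled object can then be passed to bind() or connect() cast to a sockaddr pointer with a length of sizeof(struct sockaddr_in).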

[IP6]
This section describes extensions to support sockets over Internet protocols based on IPv6. This functionality is dependent on
support of the IPV6 option (and the rest of this section is not further marked for this option).

To enable smooth transition from IPv4 to IPv6, the features defined in this section may, in certain circumstances, also be used
in connection with IPv4; see Compatibility with IPv4.

Addressing

IPv6 overcomes the addressing limitations of previous versions by using 128-bit addresses instead of 32-bit addresses. The IPv6
address architecture is described in RFC 2373.

There are three kinds of IPv6 address:

Unicast

Identifies a single interface.

A unicast address can be global, link-local (designed for use on a single link), or site-local (designed for systems not
connected to the Internet). Link-local and site-local addresses need not be globally unique.

Anycast

Identifies a set of interfaces such that a packet sent to the address can be delivered to any member of the set.

An anycast address is similar to a unicast address; the nodes to which an anycast address is assigned must be explicitly
configured to know that it is an anycast address.

Multicast

Identifies a set of interfaces such that a packet sent to the address should be delivered to every member of the set.

An application can send multicast datagrams by simply specifying an IPv6 multicast address in the address argument of sendto(). To receive multicast datagrams, an application must join the multicast group
(using setsockopt() with IPV6_JOIN_GROUP) and must bind to the socket the UDP
port on which datagrams will be received. Some applications should also bind the multicast group address to the socket, to prevent
other datagrams destined to that port from being delivered to the socket.

A multicast address can be global, node-local, link-local, site-local, or organization-local.

The following special IPv6 addresses are defined:

Unspecified

An address that is not assigned to any interface and is used to indicate the absence of an address.

Loopback

A unicast address that is not assigned to any interface and can be used by a node to send packets to itself.

Two sets of IPv6 addresses are defined to correspond to IPv4 addresses:

IPv4-compatible addresses

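These are assigned to nodes that support IPv6 and can be used when traffic is "tunneled" through IPv4.

IPv4-mapped addresses

These are used to represent IPv4 addresses in IPv6 address format; see Compatibility with IPv4.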

Note that the unspecified address and the loopback address must not be treated as IPv4-compatible addresses.

Compatibility with IPv4

The API provides the ability for IPv6 applications to interoperate with applications using IPv4, by using IPv4-mapped IPv6
addresses. These addresses can be generated automatically by the getaddrinfo()
function when the specified host has only IPv4 addresses.

Applications can use AF_INET6 sockets to open TCP connections to IPv4 nodes, or send UDP packets to IPv4 nodes, by simply
encoding the destination's IPv4 address as an IPv4-mapped IPv6 address, and passing that address, within a sockaddr_in6
structure, in the connect(), sendto(),
or sendmsg() function. When applications use AF_INET6 sockets to accept TCP
connections from IPv4 nodes, or receive UDP packets from IPv4 nodes, the system shall return the peer's address to the application
in the accept(), recvfrom(), recvmsg(), or getpeername() function
using a sockaddr_in6 structure encoded this way. If a node has an IPv4 address, then the implementation shall allow
applications to communicate using that address via an AF_INET6 socket. In such a case, the address will be represented at the API
by the corresponding IPv4-mapped IPv6 address. Also, the implementation may allow an AF_INET6 socket bound to in6addr_any to
receive inbound connections and packets destined to one of the node's IPv4 addresses.

An application can use AF_INET6 sockets to bind to a node's IPv4 address by specifying the address as an IPv4-mapped IPv6
address in a sockaddr_in6 structure in the bind() function. For an AF_INET6
socket bound to a node's IPv4 address, the system shall return the address in the getsockname() function as an IPv4-mapped IPv6 address in a sockaddr_in6
structure.
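
An IPv4-mapped IPv6 address consists of 80 zero bits, 16 one bits, and the 32-bit IPv4 address. The following sketch encodes an address in that form; the helper name make_v4mapped() is hypothetical.

```c
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: encode the IPv4 address *v4 (already in
 * network byte order) as an IPv4-mapped IPv6 address in *out. */
void make_v4mapped(const struct in_addr *v4, struct in6_addr *out)
{
    memset(out, 0, sizeof *out);        /* octets 0 to 9 are zero */
    out->s6_addr[10] = 0xff;            /* octets 10 and 11 are all ones */
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4->s_addr, 4);  /* IPv4 address last */
}
```

The result can be stored in the sin6_addr member of a sockaddr_in6 structure and checked with the IN6_IS_ADDR_V4MAPPED() macro from <netinet/in.h>.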

Interface Identification

Each local interface is assigned a unique positive integer as a numeric index. Indexes start at 1; zero is not used. There may
be gaps so that there is no current interface for a particular positive index. Each interface also has a unique
implementation-defined name.

Options

The following options apply at the IPPROTO_IPV6 level:

IPV6_JOIN_GROUP

When set via setsockopt(), it joins the application to a multicast group on an
interface (identified by its index) and addressed by a given multicast address, enabling packets sent to that address to be read
via the socket. If the interface index is specified as zero, the system selects the interface (for example, by looking up the
address in a routing table and using the resulting interface).

An attempt to read this option using getsockopt() shall result in an
[EOPNOTSUPP] error.

The parameter type of this option is a pointer to an ipv6_mreq structure.

IPV6_LEAVE_GROUP

When set via setsockopt(), it removes the application from the multicast group on
an interface (identified by its index) and addressed by a given multicast address.

An attempt to read this option using getsockopt() shall result in an
[EOPNOTSUPP] error.

The parameter type of this option is a pointer to an ipv6_mreq structure.

IPV6_MULTICAST_HOPS

The value of this option is the hop limit for outgoing multicast IPv6 packets sent via the socket. Its possible values are the same
as those of IPV6_UNICAST_HOPS. If the IPV6_MULTICAST_HOPS option is not set, a value of 1 is assumed. This option can be set via setsockopt() and read via getsockopt().

The parameter type of this option is a pointer to an int. (Default value: 1)

IPV6_MULTICAST_IF

The index of the interface to be used for outgoing multicast packets. It can be set via setsockopt() and read via getsockopt().
If the interface index is specified as zero, the system selects the interface (for example, by looking up the address in a routing
table and using the resulting interface).

The parameter type of this option is a pointer to an unsigned int. (Default value: 0)

IPV6_MULTICAST_LOOP

This option controls whether outgoing multicast packets should be delivered back to the local application when the sending
interface is itself a member of the destination multicast group. If it is set to 1 they are delivered. If it is set to 0 they are
not. Other values result in an [EINVAL] error. This option can be set via setsockopt() and read via getsockopt().

The parameter type of this option is a pointer to an unsigned int which is used as a Boolean value. (Default value: 1)

IPV6_UNICAST_HOPS

The value of this option is the hop limit for outgoing unicast IPv6 packets sent via the socket. If the option is not set, or is
set to -1, the system selects a default value. Attempts to set a value less than -1 or greater than 255 shall result in an [EINVAL]
error. This option can be set via setsockopt() and read via getsockopt().

The parameter type of this option is a pointer to an int. (Default value: Unspecified)

IPV6_V6ONLY

This socket option restricts AF_INET6 sockets to IPv6 communications only. AF_INET6 sockets may be used for both IPv4 and IPv6
communications. Some applications may want to restrict their use of an AF_INET6 socket to IPv6 communications only. For these
applications, the IPV6_V6ONLY socket option is defined. When this option is turned on, the socket can be used to send and receive
IPv6 packets only. This is an IPPROTO_IPV6-level option.

The parameter type of this option is a pointer to an int which is used as a Boolean value. (Default value: 0)

An [EOPNOTSUPP] error shall result if IPV6_JOIN_GROUP or IPV6_LEAVE_GROUP is used with getsockopt().
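
As a sketch of how the group membership options above might be used, the helpers below build an ipv6_mreq structure from a textual group address and pass it to setsockopt(). The names fill_ipv6_mreq() and join_ipv6_group() are hypothetical, and rejecting non-multicast addresses in the first helper is a choice made for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *mreq from a textual IPv6 multicast
 * address and an interface index (0 lets the system choose).
 * Returns 0 on success, or -1 if group is not a valid IPv6
 * multicast address. */
int fill_ipv6_mreq(struct ipv6_mreq *mreq, const char *group,
                   unsigned int ifindex)
{
    memset(mreq, 0, sizeof *mreq);
    if (inet_pton(AF_INET6, group, &mreq->ipv6mr_multiaddr) != 1)
        return -1;
    if (!IN6_IS_ADDR_MULTICAST(&mreq->ipv6mr_multiaddr))
        return -1;
    mreq->ipv6mr_interface = ifindex;
    return 0;
}

/* Hypothetical helper: join the group described by *mreq.
 * Returns 0 on success, or -1 with errno set by setsockopt(). */
int join_ipv6_group(int fd, const struct ipv6_mreq *mreq)
{
    return setsockopt(fd, IPPROTO_IPV6, IPV6_JOIN_GROUP,
                      mreq, sizeof *mreq);
}
```

Leaving the group would use the same structure with IPV6_LEAVE_GROUP. To receive datagrams, the socket would also need to be bound to the UDP port in question, as described under Addressing.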

Headers

The symbolic constant AF_INET6 is defined in the <sys/socket.h> header
to identify the IPv6 Internet address family. See the Base Definitions volume of IEEE Std 1003.1-2001, Chapter 13, Headers.

The sockaddr_storage structure defined in <sys/socket.h> shall
be large enough to accommodate a sockaddr_in6 structure (see the <netinet/in.h> header defined in the Base Definitions volume of
IEEE Std 1003.1-2001, Chapter 13, Headers) and shall be aligned at an
appropriate boundary so that pointers to it can be cast as pointers to sockaddr_in6 structures and used to access the fields
of those structures without alignment problems. When a sockaddr_storage structure is cast as a sockaddr_in6
structure, the ss_family field maps onto the sin6_family field.