Exploring Java's Network API: Sockets

One of Java's strengths is simplified support for the development of network software. That support manifests itself through Java's Network API, a collection of classes and interfaces located in the packages java.net and javax.net. Jeff Friesen explores Java's Network API by first investigating sockets: what the concept of a socket involves and what a socket comprises. Along the way, he explores stream and datagram sockets and teaches how to work with those socket categories via the classes InetAddress, Socket, ServerSocket, DatagramPacket, DatagramSocket, and MulticastSocket.

Simplified support for the development of network software is one of
Java's strengths. That support manifests itself through Java's Network
API, a collection of classes and interfaces located in packages java.net
and javax.net. While writing my book Java 2 by Example, Second
Edition (Que, 2000), I intended to include a chapter on the Network API.
Unfortunately, I ran out of time and that chapter did not make it into my book.
Because the thought of not including a chapter on the Network API bothered me, I
decided to create a trilogy of articles that explores that API. The article that
you are currently reading and its companion articles form that trilogy and serve
as my book's final chapter.

NOTE

My articles explore the Network API in the context of the
Internet, a global collection of interconnected networks. If you are not
acquainted with the term, a network is an interconnected set of computers
and other devices that enables communication and resource sharing. Each
networked computer is known as a host.

This article introduces you to the sockets concept. You then have an opportunity
to work with the sockets portion of the Network API. Once you finish this article,
you will be capable of using sockets for low-level network communications. The second article introduces you to
the concepts of URIs and URLs. You then have an opportunity to work with the
Network API's URI, URL, and URL-related classes.
Once you finish the next article, you will be capable of using URL
(and related classes) for high-level network communications with the Internet's
World Wide Web (WWW).

Have you ever wanted to know how electronic mail (e-mail) works? The final Network API article explores e-mail. You learn the anatomy of an e-mail
message, how to send an e-mail message, and how to receive an e-mail message.
Once you finish that article, you will be capable of building GUI-based programs
to send and receive e-mail.

NOTE

Version 1.4 (Beta 2) of Sun's Java 2 Standard Edition (J2SE) SDK was
used to build this article's programs.

What Is a Socket?

The Network API is typically used to enable communication between a Java
program and another program across a TCP/IP-based network, such as the
Internet. To enable communication, the Network API relies upon sockets. A
socket is an endpoint in a communication link between two programs. One
program writes a message (a sequence of bytes) to a socket, which
forwards that message to the other socket, which makes that message available to
the other program, as illustrated in Figure 1.

Figure 1
Two programs use sockets to communicate with each other across a TCP/IP-based
network.

According to Figure
1, Program A on Host A is writing a message to a socket. The contents of
that socket are accessed by Host A's network-management software, which
sends the message through Host A's network interface card (NIC) to Host
B. Host B's NIC gets the message and passes it to Host B's network-management
software, which deposits the message in Host B's socket. Program B can
then read that message from the socket.
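
The exchange in Figure 1 can be sketched in a few lines of Java. This is only an illustration under stated assumptions: both "programs" run in one process on the loopback interface, the class name SocketPairSketch and the method sendAndReceive are made up for this example, and the code assumes Java 9 or later (for readAllBytes and try-with-resources) rather than the SDK version used for this article's programs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SocketPairSketch {
    // Writes a message to "Program A's" socket and reads it back from
    // "Program B's" socket. Both sockets live on the local host.
    public static String sendAndReceive(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to choose any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket programA = new Socket(loopback, listener.getLocalPort());
             Socket programB = listener.accept()) {

            // Program A writes the message (a sequence of bytes) to its socket.
            OutputStream out = programA.getOutputStream();
            out.write(message.getBytes(StandardCharsets.UTF_8));
            programA.shutdownOutput(); // tells the reader that the message is complete

            // Program B reads the message from its own socket.
            InputStream in = programB.getInputStream();
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("Hello from Program A"));
    }
}
```

In a real deployment the two sockets would belong to separate programs, usually on separate hosts; the single-process layout here just keeps the sketch self-contained.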

Suppose that a third host is added to Figure
1's network. How does Host A know that the message is meant for Host
B and not for the new host? Each host attached to a TCP/IP-based network
is given a unique IP address, which is (usually) a 32-bit unsigned integer
that makes it possible to distinguish among hosts. (An IP address is analogous
to a street address.) Because people do not converse in binary, IP addresses
are often shown using dotted-decimal notation. An example is 198.163.227.6.
As you can see, there are four components comprising the address: 198, 163,
227, and 6. Each component ranges from 0 through 255 (inclusive) and accounts
for 8 bits of the address.
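
To make the relationship between dotted-decimal notation and the underlying 32-bit integer concrete, here is a small sketch that converts in both directions. The class and method names (DottedDecimalSketch, toUnsigned32, toDotted) are hypothetical. Because Java's int type is signed, the sketch holds the unsigned 32-bit value in a long.

```java
public class DottedDecimalSketch {
    // Packs four components (each 0 through 255) into one 32-bit value.
    public static long toUnsigned32(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four components: " + dotted);
        }
        long value = 0;
        for (String part : parts) {
            int component = Integer.parseInt(part);
            if (component < 0 || component > 255) {
                throw new IllegalArgumentException("component out of range: " + part);
            }
            value = (value << 8) | component; // each component fills 8 bits
        }
        return value;
    }

    // Unpacks a 32-bit value into dotted-decimal notation.
    public static String toDotted(long value) {
        return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
                + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(toUnsigned32("198.163.227.6")); // prints 3332629254
        System.out.println(toDotted(3332629254L));         // prints 198.163.227.6
    }
}
```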

NOTE

IP addresses that occupy 32 bits are known as IPv4 (Internet Protocol
version 4) addresses. Because the Internet is running out of IPv4 addresses,
IPv4 is slowly being replaced with IPv6 (Internet Protocol version 6). Unlike
IPv4 addresses, an IPv6 address is a 128-bit unsigned integer.
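
The difference in address width can be observed through the InetAddress class. The following sketch is an illustration only: the class name AddressWidthSketch and the method bitWidth are invented for this example, and it assumes the argument is a literal IP address (for a literal, getByName only checks the format and performs no name lookup).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressWidthSketch {
    // Returns the number of bits occupied by a literal IP address.
    public static int bitWidth(String literal) {
        try {
            InetAddress address = InetAddress.getByName(literal);
            // getAddress() returns the raw bytes: 4 for IPv4, 16 for IPv6.
            return address.getAddress().length * 8;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bitWidth("198.163.227.6")); // prints 32
        System.out.println(bitWidth("::1"));           // prints 128
    }
}
```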

Suppose that a second network-aware program is added to Host B in Figure 1's
network. How does Host A know that the message is meant for Program B and not
for the new program? Each program communicating over a TCP/IP-based network
is given a unique port and port number. A port is a message buffer that
holds a socket's incoming/outgoing message, and the port number
is a 16-bit unsigned integer ranging from 0 through 65,535 (inclusive) that
identifies a port and makes it possible to distinguish among network-aware programs
on a given host. (A port number is analogous to the box number of a house on
a street.) Port numbers less than 1,024 are reserved for standard services, such
as POP3's port number 110. (I discuss POP3 in my third
article in this series.)
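
As a rough illustration of the port-number range, the sketch below checks whether a value fits in 16 unsigned bits and asks the operating system for a free port. The names PortNumberSketch, isValidPort, and pickFreePort are hypothetical, and the code assumes Java 7 or later for try-with-resources.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

public class PortNumberSketch {
    public static final int MAX_PORT = 0xFFFF; // 65,535, the largest 16-bit unsigned value

    // A port number must fit in 16 unsigned bits.
    public static boolean isValidPort(int port) {
        return port >= 0 && port <= MAX_PORT;
    }

    // Binding to port 0 lets the operating system pick a currently free port.
    public static int pickFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidPort(110));   // prints true
        System.out.println(isValidPort(65536)); // prints false
        System.out.println("Free port chosen by the OS: " + pickFreePort());
    }
}
```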

Each socket combines an IP address with a port and a port number. Those
entities identify that socket to other sockets. Subsequent sections explore two
categories of sockets: stream and datagram.
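
In Java, the pairing of an IP address with a port number is commonly represented by the InetSocketAddress class. The sketch below is only an illustration: EndpointSketch and endpoint are made-up names, and the address and port are the examples used earlier in this section, not a host you should expect to reach.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class EndpointSketch {
    // Combines a literal IP address and a port number into one endpoint object.
    public static InetSocketAddress endpoint(String ipLiteral, int port) {
        try {
            return new InetSocketAddress(InetAddress.getByName(ipLiteral), port);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + ipLiteral, e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress pop3 = endpoint("198.163.227.6", 110);
        // prints 198.163.227.6:110
        System.out.println(pop3.getAddress().getHostAddress() + ":" + pop3.getPort());
        // Same host, different port: a different endpoint. Prints false.
        System.out.println(pop3.equals(endpoint("198.163.227.6", 25)));
    }
}
```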

NOTE

This section referred to TCP/IP without providing any explanation of that
term. TCP/IP is an acronym for Transmission Control Protocol/Internet
Protocol, the main network protocols (rules for formatting messages and
routing those messages among hosts) found in a host's network-management
software. IP routes message chunks, known as IP packets, to the correct
host by using each IP packet's embedded IP address. TCP establishes a
connection between two hosts for sending and receiving messages consisting of
multiple IP packets. On the sending end, TCP divides a message into multiple IP
packets and relies on IP to deliver those IP packets to their destination host.
On the receiving end, TCP assembles those IP packets into the original message.
A third network protocol in the TCP/IP suite, User Datagram Protocol
(UDP), allows a message that fits into a single IP packet to be sent without
requiring a connection. TCP is a reliable but comparatively slow network protocol: It
retransmits lost IP packets and discards corrupted ones, so a message either
arrives intact or the connection fails with an error, but it
takes time to establish a connection. By contrast, UDP is an unreliable but fast
network protocol: It does not guarantee that a message will reach its
destination (or arrive without errors), but it does not need to take time
establishing a connection.
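
The connectionless nature of UDP can be seen in a short sketch that sends one datagram over the loopback interface. The names DatagramSketch and sendAndReceive are invented for this example, the 2-second timeout and 1,024-byte buffer are arbitrary choices, and the code assumes Java 7 or later. Compare it with a stream socket: here no connection is established, and each packet carries its own destination address and port number.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class DatagramSketch {
    // Sends one UDP datagram to a receiver on the local host and returns what arrived.
    public static String sendAndReceive(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the operating system choose a free port for the receiver.
        try (DatagramSocket receiver = new DatagramSocket(new InetSocketAddress(loopback, 0));
             DatagramSocket sender = new DatagramSocket()) {
            // UDP does not guarantee delivery, so never wait forever.
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection: the packet itself names the destination.
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming); // throws SocketTimeoutException if nothing arrives
            return new String(incoming.getData(), incoming.getOffset(),
                    incoming.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("Hello via UDP"));
    }
}
```

On a single host the datagram will almost always arrive, but across a real network the receive call may time out, which is exactly the trade-off described above.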