Inspecting the X Client-Server handshake using gdb

"A computer handshake is basically this: its a term used to describe the process of one computer establishing a connection with another computer or device. The handshake is often the steps of verifying the connection, the speed, and/or the authorization of the computer trying to connect to it."

At the start of the X Client-Server connection, as we describe in the current section, a three-way
handshake takes place, in which a number of structs (xConnClientPrefix, xConnSetupPrefix,
xConnSetup, etc.) are exchanged between the two sides. As a result the client learns about the capabilities of the server's graphics system, which allows communication to begin at the
X protocol layer (the client's X requests and the server's replies).

To experiment hands-on with the X client-server connection establishment phase, which takes
place with the XOpenDisplay() call described in section 1.2, the following tools will be used:

I. Using the GNU Debugger (gdb)

The aim here is to examine the Display struct, which XOpenDisplay() returns to the test
program during the connection phase between the X client and the X server. To give the GNU debugger
the ability to examine the fields of the Display struct we change the previous program so that
it includes the Xlibint.h header file, which defines Display:

We start our program (./test) in gdb from the shell. To make things more
interesting we use the TCP protocol (TCP/IP sockets) for the message exchange between the X
client and the X server, instead of the Unix sockets that are used by default when no protocol is
explicitly stated:

gdb ./test

Using TCP sockets, the first shell command defines
tcp as the protocol and 192.168.1.1 as the IP address of the
X Server. This is the IP address of the system we use, which hosts both client and server.
The pair of zeros that follows the colon corresponds to the display number and the screen number. The
display number selects the specific X Server we will use (in case more than one is running
on the same system) and the screen number selects the screen (in case we have a multiheaded system). With the
second command, since no protocol is given, we run locally with Unix sockets and therefore
no IP address is required. More details can be found in
ConnDis.c, where the following format is explained:

[protocol/] [hostname] : [:] displaynumber [.screennumber]

Before starting gdb, the following command should be run with root permissions, so that
the X Server on my Ubuntu system accepts connections from any host:

sudo xhost +

The first instruction we use in gdb places a breakpoint at line 21. This stops program
execution after the following instruction:

d=XOpenDisplay(NULL);

Then we run the program (until line 21) with the run instruction. We type:

(gdb)break 21

and

(gdb)run

Next we show the fields of struct d (the Display) with the following gdb command:

(gdb)print *d

In the previous frame we find struct Display as defined in Xlibint.h. We noted previously that the
aim of this section is to examine some of the Display field
values, which include info from the initial X Client-X Server negotiation. Display is the structure
returned by XOpenDisplay(). This
should not be confused with the display (usually written with a lower-case 'd'),
which is used in UNIX and X Window terminology to indicate the X Server's display.
The latter is the one and only
argument of XOpenDisplay(), a pointer to a string that either has the form:

[protocol/] [hostname] : [:] displaynumber [.screennumber]

or it is left NULL. This is initially provided by the command line, when the X Client process
is invoked, or if it is NULL it is derived from the environment variable DISPLAY.

In other words, display is used by XOpenDisplay() to create Display and is then stored as
the 'display_name' member of the Display struct.
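
The display string format shown above can be taken apart with a few lines of C. The following is only a minimal sketch and not the code of ConnDis.c: the struct display_parts and the function parse_display_name() are hypothetical names introduced for illustration, and cases such as IPv6 addresses are ignored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed pieces; not an Xlib type. */
struct display_parts {
    char protocol[16];   /* e.g. "tcp"; empty when omitted */
    char hostname[64];   /* e.g. "192.168.1.1"; empty when omitted */
    int  display;        /* display number (which X Server) */
    int  screen;         /* screen number; 0 when omitted */
};

/* Parse "[protocol/][hostname]:[:]displaynumber[.screennumber]".
 * Returns 0 on success and -1 on malformed input. */
int parse_display_name(const char *name, struct display_parts *out)
{
    assert(name != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    /* The display number follows the last colon. */
    const char *colon = strrchr(name, ':');
    if (colon == NULL)
        return -1;

    /* An optional "protocol/" precedes the host name. */
    const char *host = name;
    const char *slash = memchr(name, '/', (size_t)(colon - name));
    if (slash != NULL) {
        size_t n = (size_t)(slash - name);
        if (n >= sizeof out->protocol)
            return -1;
        memcpy(out->protocol, name, n);
        host = slash + 1;
    }

    /* Skip the optional second colon (the "[:]" of the format). */
    const char *host_end = colon;
    if (host_end > host && host_end[-1] == ':')
        host_end--;
    size_t hn = (size_t)(host_end - host);
    if (hn >= sizeof out->hostname)
        return -1;
    memcpy(out->hostname, host, hn);

    /* Display number, then the optional ".screennumber". */
    char *end;
    long d = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || d < 0)
        return -1;
    out->display = (int)d;
    if (*end == '.') {
        const char *s = end + 1;
        long sc = strtol(s, &end, 10);
        if (end == s || sc < 0)
            return -1;
        out->screen = (int)sc;
    }
    return *end == '\0' ? 0 : -1;
}
```

For the string tcp/192.168.1.1:0.0 this sketch yields the protocol "tcp", the host name "192.168.1.1", display number 0 and screen number 0, while for :0 both the protocol and the host name stay empty.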

As we saw in section 1.2, many Display
fields obtain their initial value in
XOpenDisplay(). The process of filling
the Display struct is completed during the initial X client-server negotiation. In the following
box we see that the information is gathered from several sources:

If we follow the previous code samples of XOpenDisplay() we see that the Display fields are filled by
different function calls inside XOpenDisplay(), for instance _X11TransConnectDisplay() and
_XSendClientPrefix(). The former makes the initial connection request and the latter is used
by the client to send the initial connection setup information, along with the
authorization information, to the X server.

After _X11TransConnectDisplay() returns, trans_conn is filled, and the arguments passed by
reference, for instance 'fullname', are used to fill some more Display fields:

dpy->display_name = fullname;

As we previously saw, all the Display fields are returned in gdb with the 'print *d' command,
where d is the variable that represents the Display. Individual fields of the Display struct,
for instance 'display_name', can be accessed with the GNU debugger as:

(gdb)print d->display_name

The client prefix with authorization info is sent to the X Server. The prefix is represented here by
'client', which is filled with info about the X Protocol version, the Protocol Revision and the
endianness of the system (the ASCII character 'l' for little endian and the ASCII character 'B' for big
endian). Variable client is of type
xConnClientPrefix, which is defined as:
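
As an illustration of what the client prefix carries, here is a minimal sketch that mirrors the field layout of xConnClientPrefix from X11/Xproto.h with plain <stdint.h> types, so that it compiles without the X headers. The names client_prefix, host_byte_order() and fill_client_prefix() are assumptions made for this example and are not Xlib identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Field layout follows xConnClientPrefix in X11/Xproto.h; the type name
 * client_prefix and the <stdint.h> types are stand-ins for this sketch. */
typedef struct {
    uint8_t  byteOrder;        /* 'l' = little endian, 'B' = big endian */
    uint8_t  pad;
    uint16_t majorVersion;     /* X protocol version */
    uint16_t minorVersion;     /* X protocol revision */
    uint16_t nbytesAuthProto;  /* length of the authorization protocol name */
    uint16_t nbytesAuthString; /* length of the authorization data */
    uint16_t pad2;
} client_prefix;

/* Probe the host byte order by looking at the first byte of the value 1. */
uint8_t host_byte_order(void)
{
    uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first ? 'l' : 'B';
}

/* Fill the prefix for protocol version 11, revision 0. */
void fill_client_prefix(client_prefix *c, uint16_t auth_name_len,
                        uint16_t auth_data_len)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
    c->byteOrder = host_byte_order();
    c->majorVersion = 11;
    c->minorVersion = 0;
    c->nbytesAuthProto = auth_name_len;
    c->nbytesAuthString = auth_data_len;
}
```

The structure is 12 bytes long; the authorization protocol name and the authorization data follow it on the wire, with the lengths given in the two nbytes fields.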

The client's request is followed by an _XRead() call (discussed in
section 1.2), which reads the
X Server's reply from the socket; according to this reply some more Display fields are filled:

Section 3.2.5.5 describes how the
X Server reads the client's request, which is an xConnClientPrefix (the type of 'client' in
XOpenDisplay), and replies with an xConnSetupPrefix message. We see in that section that the
X Server processes the info sent by the client, for instance the byte order and the version,
and sends back with the xConnSetupPrefix message info that also fills some Display fields, for
instance majorVersion, the X Protocol version, which is compared with the client's version in the
XOpenDisplay() code to see if it matches.

We can verify that the X Protocol Version is 11 and the Revision 0,
by entering the 'X -version' command at the shell:

Next, if the setup prefix sent by the Server has the success field set to
xTrue (defined as 1), which means the
connection was accepted by the Server, another _XRead() is issued to read some additional info
sent by the X Server (the number of bytes is indicated by the 'length'
field in xConnSetupPrefix). This info is sent in the form of
xConnSetup (variable u in XOpenDisplay() is a union):

We can then inspect the values of those Display fields with the 'print' gdb command:

From those results we see that the number of screens is one, the release corresponds to the
version that we found previously (1.7.6) and the max request size is a two-byte value
(2^16 = 65,536 values, or 0-65,535). Also the byte order is little endian (0). The latter
follows from the
SendConnSetup code, where the macro
ClientOrder is used, which in turn calls
ServerOrder. Although the client sends the byte order as 'l' or 'B' for little and big
endian respectively, the X Server replies with 0 for little endian (LSBFirst) and 1 for big
endian (MSBFirst). See also the following comment from
XSCOPE: A Debugging and Performance Tool for X11:
"On the other hand, the names ``MSB first'' and ``LSB first'' are defined, in two different contexts, as either 0x42 and 0x6C (setup message from the client to the server) or 1 and 0 (setup reply from the server to the client)."

The X Client - X Server connection

After the initial connection which took place in the sockets interprocess communication layer
between the X Client and the X Server, described in the subsections of
section 1.4 and also
section 3.2.5.5,
the X client sends in the XOpenDisplay()
code the initial setup information, the client prefix:

Then, with an _XRead() call, which is resolved to a read(), the Client receives the Server's reply:

_XRead (dpy, (char *)&prefix,(long)SIZEOF(xConnSetupPrefix));

Another _XRead() is issued by the Client to receive xConnSetup, the additional bytes of Server
info. Their number is indicated by the field 'length' in xConnSetupPrefix:

_XRead (dpy, (char *)u.setup, setuplength);

From the previous figure one could assume that the socket
functions used at this stage were read() and write(). As we explain at the
end of the current section, the actual routines turn out to be read() and writev().

In the current section we examined the Display struct fields using gdb.
We also described the process of the initial message exchange (xConnSetupPrefix, xConnSetup)
that fills many of the Display fields. In the
next section we take a closer look at those messages
and examine them with a real-world experiment.

The read(), writev() calls used by the X Client and X Server

To find out why, among the other choices, the read() and writev() system calls are used between
the X Client and the X Server, we have to look at both the client and the server side.

I. As we described in the current section
XOpenDisplay() calls
_XSendClientPrefix() to send the initial connection setup information from the client, the
xConnClientPrefix. The _XSendClientPrefix() call is resolved to a _X11TransWritev(). This is
resolved to a
TRANS(Writev) as we previously saw in
section 1.2 with the _X11TransRead() call. As we also see in section 1.2
TRANS(Writev) for the TRANS(SocketTCPFuncs) becomes
TRANS(SocketWritev), which is an encapsulation of
WRITEV(). As we read in
Xtransint.h WRITEV for UNIX is the writev() system call.
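
To see why writev() suits this step, the following minimal sketch sends a fixed-size header and a variable-size payload (think of the client prefix followed by the authorization data) with a single writev() call, and reads them back with plain read() calls. The function names send_header_and_payload() and read_full() are hypothetical helpers for this example and are not part of Xlib or xtrans; error handling is reduced to the bare minimum.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send a header and a payload that live in separate buffers with one
 * writev() call. Returns the number of bytes written, or -1 on error. */
ssize_t send_header_and_payload(int fd, const void *hdr, size_t hdr_len,
                                const void *payload, size_t payload_len)
{
    struct iovec iov[2];
    assert(hdr != NULL && payload != NULL);
    iov[0].iov_base = (void *)hdr;      /* writev() does not modify the data */
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)payload;
    iov[1].iov_len  = payload_len;
    return writev(fd, iov, 2);
}

/* Read exactly len bytes with read(), looping over short reads.
 * Returns 0 on success, -1 on error or end of file. */
int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The two buffers never have to be copied into one contiguous block before sending, which is the convenience writev() offers over write() here. A connected pair of sockets from socketpair(AF_UNIX, SOCK_STREAM, 0, sv) is enough to try both functions in one process.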

IV.
We now move to the Server side to look for the read() call, marked IV in the previous figure.
This is covered mainly by
section 3.2.5.5. As we read in this section
"EstablishNewConnections(), called from WaitForSomething() via QueueWorkProc(), in the
Dispatch() loop is the X Server mechanism to accept new X Clients." EstablishNewConnections
calls AllocNewConnection, which calls NextAvailableClient() and this calls
InitClient(). The latter sets the readRequest routine for the client to
StandardReadRequestFromClient. This is resolved to _XSERVTransRead, actually a read()
system call, as we see in section 3.2.5.3.

As we read in section 2.2 InitClient()
sets the client's vector to 'InitialVector'.

InitialVector[] has only three elements and
NextAvailableClient() sets the request type to 1, which means that the second routine,
ProcInitialConnection(), will dispatch the client's request. The request data are also of type
xConnClientPrefix. NextAvailableClient() prepares a "Fake Request", that is, a request prepared
by the server on behalf of the client and not by the client itself, and uses InsertFakeRequest()
to place the 'InitialConnection' request in the input buffer of this specific client.

When WaitForSomething() finally returns to Dispatch(), new clients have requests ready and
ReadRequestFromClient() is called to read those requests for each of those clients. For the new
client (request type 1) ProcInitialConnection() is executed to determine the endianness of the
client. Another 'Fake' request is then inserted on behalf of the client, with request type 2, which
corresponds to ProcEstablishConnection().
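
The two-step dispatch through a three-element vector can be modelled in a few lines. This is only a sketch of the idea and not the X Server code: struct client, init_client(), dispatch_pending() and the lower-case handler names are hypothetical stand-ins, and the "fake request" is reduced to a single integer holding the pending request type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct client;                                  /* stand-in for the server's client record */
typedef int (*request_proc)(struct client *);

struct client {
    const request_proc *request_vector;  /* table indexed by request type */
    int     fake_request;                /* pending request type, 0 if none */
    uint8_t prefix_byte_order;           /* 'l' or 'B' from the client prefix */
    int     swapped;                     /* 1 if client and server byte order differ */
    int     established;                 /* 1 once the setup reply would be sent */
};

int proc_initial_connection(struct client *c);
int proc_establish_connection(struct client *c);

/* Three elements: index 0 is unused, 1 and 2 are the connection steps. */
const request_proc initial_vector[3] = {
    NULL, proc_initial_connection, proc_establish_connection
};

int server_is_little_endian(void)
{
    uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Step 1: work out the client's endianness, then queue request type 2. */
int proc_initial_connection(struct client *c)
{
    int client_little = (c->prefix_byte_order == 'l');
    c->swapped = (client_little != server_is_little_endian());
    c->fake_request = 2;
    return 0;
}

/* Step 2: the place where authorization checks and the setup reply belong. */
int proc_establish_connection(struct client *c)
{
    c->established = 1;
    c->fake_request = 0;
    return 0;
}

/* New clients start with the initial vector and a pending request type 1. */
void init_client(struct client *c, uint8_t prefix_byte_order)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
    c->request_vector = initial_vector;
    c->prefix_byte_order = prefix_byte_order;
    c->fake_request = 1;
}

/* Run pending requests through the vector; returns how many were handled. */
int dispatch_pending(struct client *c)
{
    int handled = 0;
    while (c->fake_request > 0 && c->fake_request < 3) {
        request_proc proc = c->request_vector[c->fake_request];
        if (proc == NULL || proc(c) != 0)
            break;
        handled++;
    }
    return handled;
}
```

Calling init_client() and then dispatch_pending() handles exactly two requests, type 1 followed by type 2, which is the order described above.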

V.
ProcEstablishConnection() checks for possible authentication required by the server
and it calls
SendConnSetup(). SendConnSetup() updates the client's requestvector
and uses
WriteToClient() to send back to the client the xConnSetupPrefix:

WriteToClient() uses
FlushClient() in case the client's buffer becomes full, and this calls _XSERVTransWritev(), which as we see in
section 2.2 is the expansion of
TRANS(Writev), which as we mentioned previously is resolved to a writev() call. If the
client's buffer is not full, it just copies the client's data there and waits for
FlushAllOutput() (see section 3.2.5 and
section 3.2.5.4), which also calls
FlushClient().
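
The buffering behaviour described above can be sketched as follows. This is a simplified model and not the X Server source: struct out_buf, write_to_client() and flush_client() are hypothetical names, the buffer is deliberately tiny, and a short write is simply treated as an error.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define OUT_BUF_SIZE 16   /* deliberately tiny so that the example overflows */

struct out_buf {
    int    fd;                  /* connection to the client */
    size_t used;                /* bytes currently buffered */
    char   data[OUT_BUF_SIZE];
};

/* Write the buffered bytes plus the extra bytes with one writev() call.
 * Returns 0 on success, -1 on error or short write. */
int flush_client(struct out_buf *ob, const void *extra, size_t extra_len)
{
    struct iovec iov[2];
    ssize_t want = (ssize_t)(ob->used + extra_len);

    iov[0].iov_base = ob->data;
    iov[0].iov_len  = ob->used;
    iov[1].iov_base = (void *)extra;    /* writev() does not modify the data */
    iov[1].iov_len  = extra_len;
    if (writev(ob->fd, iov, 2) != want)
        return -1;
    ob->used = 0;
    return 0;
}

/* Copy the data into the buffer if it fits; otherwise flush the buffer
 * together with the new data. */
int write_to_client(struct out_buf *ob, const void *data, size_t len)
{
    assert(ob != NULL && data != NULL);
    if (ob->used + len > sizeof ob->data)
        return flush_client(ob, data, len);
    memcpy(ob->data + ob->used, data, len);
    ob->used += len;
    return 0;
}
```

With a 16-byte buffer, a first write of 10 bytes is only copied, while a second write of 10 bytes no longer fits and causes all 20 bytes to go out in one writev() call.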

II.
After sending the xConnClientPrefix message to the X Server (step I), the X Client reads the
server's reply with the following instruction in XOpenDisplay():

_XRead (dpy, (char *)&prefix,(long)SIZEOF(xConnSetupPrefix));

The _XRead(), as we see in section 1.2, is
resolved to a read() system call.

VI.
Just after the X Server sends the xConnSetupPrefix in step V, it also sends the main info
(xConnSetup, etc.) with the following WriteToClient call: