WebSockets: Writing the Client

We're going to do a little three-part introduction to WebSockets, which
provide a way to communicate back and forth with the web server without
all the overhead of a standard HTTP
connection.

And, in the course of it, we'll be writing a simple chat server and
client.

We're going to be piggybacking on the previous blog entry, on writing
A NodeJS Webserver. In particular,
that webserver will be used for generic webserving on this project, and
will also be what the WebSockets run on.

Before We Begin

If you're the type who wants to download the code and see it run
before you do anything, pop on over
to the final installment and go through the first couple sections on
running the server. Then come back. :)

The Client Side

Now that we have an idea of what WebSockets are, it's time to write
the client.

The basic approach is:

1. Make the UI components.

2. Create a new WebSocket object with the destination URL and desired
   protocol. WebSocket URLs begin with ws://, or wss:// for encrypted
   connections.

3. Set up listeners for the open, close, error, and message events on
   the WebSocket object.

4. Write the handlers for those events.

Let's start with the UI.

The User Interface

Before we tear into what those handler functions should do, we should
probably define the UI they'll be doing it to.

This is a chat program, so let's define the UI to have a chat output
area where all the messages go, a text field where you can enter your
name, a text field where you enter your chat message, and a "Send"
button.

We'll set the CSS of chat-output to have a good height
and overflow:auto so that it puts scrollbars on as necessary:

CSS

#chat-output {
    height: 20ex;
    overflow: auto;
    border: 1px solid gray;
}

And that should give you something that looks like this:

The best, most well-researched UI in the history of all
humanity, I'm sure.

Interfacing with the User Interface (from the program's
perspective)

We're going to do a quick diversion on how to interact with the
HTML elements in the DOM. We
need to be able to get the value of the text inputs so we can send that
data to the server. We also need to be able to add stuff to the chat
output window and have it automatically scroll down to the bottom when
it fills up.

There's a function you can call to get a reference to any DOM element,
identified by the same selector you'd use in CSS. For example,
querySelector() will get you a reference to the chat-input field:

JavaScript

// Get a reference to the DOM element with ID "chat-input"
var chatInput = document.querySelector('#chat-input');

In my code, I have a little helper function that keeps this call to a
more reasonable length in practical use.
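Something like the following would do the job. This is only a sketch:
the name qs and the optional root argument are assumptions of mine, not
necessarily what the real code uses.

```javascript
// Hypothetical helper: a short alias for querySelector().
// The optional `root` argument (an assumption) lets you search inside
// something other than the whole document.
function qs(selector, root) {
    return (root || document).querySelector(selector);
}

// In the browser, reading what the user typed would then look like:
//   var text = qs('#chat-input').value;
```

With that in place, qs('#chat-input').value reads the current contents
of the text field.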

For the main chat window with all the messages on it, we do things a
little differently, since it doesn't have a value.
(value is only used on <input> elements, and
the chat output window is a <div>.)

To get and set the contents of the chat output window, we'll use its
innerHTML property, which gets or sets whatever HTML is within
it. As a bonus, this means we can decorate the messages that show up in
the chat window with HTML markup to make them italic or bold or
whatever. Any valid HTML can be packed inside.

We'll add a function to write a line of text to the chat output, and
prepend it with a newline if necessary (which we'll represent with an
HTML <br> tag).

First, we get a reference to the output window, chat-output.
Then we get its current contents and store them in a local variable,
innerHTML. We compute the new output, append it to the previous value in
innerHTML, and assign the result back to the chat-output
element.

And, finally, as you see, there's a bit of magic code at the bottom
to set the scrolling position of the window so that the bottom-most
content is still displayed. This effectively auto-scrolls the window
when content falls off the bottom.
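Here's a rough sketch of such a function, following the steps just
described. The name writeOutput and the optional second argument are
assumptions for illustration; the scrolling line at the bottom uses the
standard scrollTop and scrollHeight properties.

```javascript
// Append a line of (already safe) HTML to the chat output area.
// `chatOutput` is optional (an assumption, handy for testing); by default
// we look up the element with ID "chat-output".
function writeOutput(s, chatOutput) {
    chatOutput = chatOutput || document.querySelector('#chat-output');

    // Current contents of the window
    var innerHTML = chatOutput.innerHTML;

    // Prepend a <br> only if there is already something in the window
    var newOutput = innerHTML === '' ? s : '<br>' + s;

    chatOutput.innerHTML = innerHTML + newOutput;

    // Scroll so the bottom-most content stays visible
    chatOutput.scrollTop = chatOutput.scrollHeight;
}
```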

And that's all there is to that. The WebSocket will be created, and
the events will occur as appropriate.

Now, some of those events happen in an expected order. First,
we'd expect to get an open event before anything else arrives,
with the possible exception of an error event, which means the
computer messed something up, because it's never our fault,
right?
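To make the setup concrete, here's a rough sketch of creating the
socket and attaching the four listeners. Treat it as an illustration
only: the URL, port, protocol name, the createChatSocket() wrapper, and
all handler names except onSocketMessage are assumptions.

```javascript
// Placeholder handlers; here they just log what happened.
function onSocketOpen(ev)    { console.log('socket open'); }
function onSocketClose(ev)   { console.log('socket closed'); }
function onSocketError(ev)   { console.log('socket error'); }
function onSocketMessage(ev) { console.log('got message: ' + ev.data); }

// Create the socket and attach the four listeners.
// `SocketCtor` is an assumption added so the function can be exercised
// outside a browser; it defaults to the built-in WebSocket.
function createChatSocket(url, protocol, SocketCtor) {
    var Ctor = SocketCtor || WebSocket;
    var ws = new Ctor(url, protocol);

    ws.addEventListener('open', onSocketOpen);
    ws.addEventListener('close', onSocketClose);
    ws.addEventListener('error', onSocketError);
    ws.addEventListener('message', onSocketMessage);

    return ws;
}

// In the browser (URL, port, and protocol name are made up):
//   var ws = createChatSocket('ws://localhost:8080/', 'chat-protocol');
```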

Actually, we're going to do something a little more clever with the
above code. We're going to create the WebSocket like this:

That is, instead of hardcoding it to look at localhost
(which isn't particularly useful for people over the network), we code it
up to say, "Open the WebSocket to the same host-and-port I loaded this
web page from."

What's this parseLocation() function, though? We code that
up using a hackish
little trick wherein you set the href of an <a> element to the
URL, and let the browser do the hard work of parsing it. Once
you do that, you can read the host and port from the element's host
property, just like we've done, above.
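A sketch of how that might fit together follows. parseLocation() is
named in the text, but its exact signature, the optional doc argument,
the makeSocketUrl() helper, and the protocol name in the comment are my
assumptions.

```javascript
// Let an <a> element parse the URL for us. `doc` is an optional stand-in
// for `document` (an assumption, handy for testing).
function parseLocation(url, doc) {
    var a = (doc || document).createElement('a');
    a.href = url;
    return a;   // a.host is now "hostname:port" (port omitted if default)
}

// Build a WebSocket URL pointing at the same host and port as `pageUrl`.
function makeSocketUrl(pageUrl, doc) {
    var loc = parseLocation(pageUrl, doc);
    return 'ws://' + loc.host + '/';
}

// In the browser:
//   var ws = new WebSocket(makeSocketUrl(window.location.href), 'chat-protocol');
```

If the page itself is served over HTTPS, you'd want to build a wss://
URL instead.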

ws.send() is what's going to actually send the data to the
server. Before calling it, we run the data through a helper function:
makeMessage(). All makeMessage() does is make an object with the type
and payload, and convert it into a JSON string.
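In sketch form, that might look like the following. The property names
type and payload follow the description above; the sendChatMessage()
wrapper, the 'chat-message' type name, and the fields inside the payload
are assumptions, not the real wire format.

```javascript
// Wrap a type and payload in an object and serialize it to JSON.
function makeMessage(type, payload) {
    return JSON.stringify({ type: type, payload: payload });
}

// Hypothetical sender: the 'chat-message' type and the payload fields
// are assumptions made for this example.
function sendChatMessage(ws, username, message) {
    ws.send(makeMessage('chat-message', {
        username: username,
        message: message
    }));
}
```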

And away we go! The data's on its way to the server!

onSocketMessage implementation

And how do we get data back from the server?

We're going to be getting a JSON string from the server, so first
we'll parse that into a JavaScript Object with
JSON.parse().

Then we'll look at the message object's type property to see
what kind of message it is.
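Those two steps could be sketched like this. The event's data property
is where the browser puts the raw string from the server; the
'chat-message' type name, the payload fields, and the showLine callback
are assumptions for the sake of the example.

```javascript
// Handle one incoming message event from the socket.
// `showLine` is a hypothetical callback that puts a line on the screen;
// it is expected to deal with escaping before touching innerHTML.
function onSocketMessage(ev, showLine) {
    // ev.data holds the raw JSON string sent by the server
    var msg = JSON.parse(ev.data);

    switch (msg.type) {
        case 'chat-message':   // made-up type name
            showLine(msg.payload.username + ': ' + msg.payload.message);
            return true;

        default:               // unknown type: ignore it
            return false;
    }
}
```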

This is actually pretty important. See, we're putting HTML
we're getting from the server right on the screen, and the server's
getting it right from other users. The browser will happily inject
whatever HTML it gets, and will execute it.

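The last thing you need is an attacker sending you a chat message
that reads,
"<script>sendMeYourLoginCredentialsBwaHaHa()</script>!!" and
having your browser execute it instead of displaying it on the screen.
(Strictly speaking, browsers won't run a <script> tag that arrives via
innerHTML, but something like an <img> tag with an onerror attribute
will run, so the danger is very real.)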

The solution to this problem is to translate all the HTML special
characters into their associated HTML entities. At
the very least, the following should be done (starting with the first
one):
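A minimal sketch of that translation, assuming the characters in
question are &, <, and > (my assumption about the minimum set). The
ampersand has to go first; otherwise the & in the freshly made &lt; and
&gt; would get escaped a second time. The name escapeHTML is made up
for this example.

```javascript
// Translate HTML special characters into entities so the browser
// displays them as text instead of treating them as markup.
function escapeHTML(s) {
    return String(s)
        .replace(/&/g, '&amp;')   // must come first
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}
```

Anything that came from another user would go through escapeHTML()
before being added to the chat output.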

Shortcomings and Room for Improvement

Since this is just a toy program, there are definitely a few things
wrong with it.

First of all, the usernames are sent in a very ungainly way that (as
you'll see) makes it difficult for the server to keep track of names. It
would be better to have the user log in and pass a username packet to the
server. And then if they changed their name, pass a new username packet.
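As a rough illustration of that idea (not something the current code
does), the client could send a username packet only when the name
actually changes. The 'set-username' type and the payload shape are
entirely made up here.

```javascript
// Returns a function that sends a username packet via `sendFn`, but only
// when the name differs from the last one sent. The packet type and
// payload shape are hypothetical.
function makeUsernameTracker(sendFn) {
    var lastSent = null;

    return function (name) {
        if (name === lastSent) {
            return false;   // nothing changed, nothing sent
        }
        sendFn(JSON.stringify({
            type: 'set-username',
            payload: { username: name }
        }));
        lastSent = name;
        return true;
    };
}
```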

Secondly, you can trivially impersonate anyone by entering their
username.

Also, even if you stopped that, there's nothing in the packet
structure to prevent other people from impersonating you. Eventually,
the server would have to keep track of you by unique user ID if you
wanted to mitigate that.

Obviously the UI is horrible.

The code isn't structured in a way that makes it easy to start new
instances of chat windows. I could imagine a case where you might want
multiple chat windows per page, so you could refactor the code to support this.

And there are plenty of other bonus shortcomings I'm too tired to
think of right now, I'm sure.

Up Next

The only remaining piece in the series is the WebSockets server side,
so we'll tackle that last. See you then!