The "test vector" for Sec-WebSocket-Key encoding, provided in RFC 6455 section 4.1, is wrong. It was processed by a base 64 encoder that was "improperly implemented" as mentioned in RFC 4648 section 3.5. Pad bits were not set to zero, which is a "MUST" requirement in RFC 4648, and was also required by many other RFCs, going back to RFC 989 in the 1980s. In the string "AQIDBAUGBwgJCgsMDQ4PEC==", the final C should be changed to A.

In a Sec-WebSocket-Key header sent by a properly implemented client, the last letter will always be A, Q, g or w.

However, the |Sec-WebSocket-Extensions| header field MUST NOT appear
more than once in an HTTP response.

It should say:

The |Sec-WebSocket-Extensions| header field MAY appear multiple
times in an HTTP response (which is logically the same as a single
|Sec-WebSocket-Extensions| header field that contains all values).

Notes:

Section 4.2.2 Step 5 subpart 6 (top of page 25) clearly explains that this header field may appear multiple times in the server's response: "If multiple extensions are to be used, they can all be listed in a single |Sec-WebSocket-Extensions| header field or split between multiple instances of the |Sec-WebSocket-Extensions| header field. This completes the server's handshake..."

2. If the client already has a WebSocket connection to the remote
host (IP address) identified by /host/ and port /port/ pair, even
if the remote host is known by another name, the client MUST wait
until that connection has been established or for that connection
to have failed. There MUST be no more than one connection in a
CONNECTING state. If multiple connections to the same IP address
are attempted simultaneously, the client MUST serialize them so
that there is no more than one connection at a time running
through the following steps.

It should say:

2. If the client already has a WebSocket connection to the same IP
address and port pair, even if the remote host is known by another
name, the client MUST wait until that connection has been established
or for that connection to have failed. There MUST be no more than
one connection in the CONNECTING state for any IP address and port
pair. If multiple connections to the same IP address and port pair
are attempted simultaneously, the client MUST serialize them so that
there is no more than one connection at a time running through the
following steps.

Notes:

The original wording makes it ambiguous whether distinct ports on the same host are considered distinct for throttling purposes. The first sentence implies that the throttling should be applied per (host,port) pair, whereas the final sentence implies it is only based on IP address.

Implementations of both interpretations appear to exist in the wild.

I propose disambiguating in favour of the (host,port) interpretation, because the host-only interpretation allows for a potential denial-of-service attack targeted against hosts which drop packets to unused ports.

For example, an attacker from a different origin can create several hundred WebSockets to "wss://google.com:81/". Each one will attempt to connect in serial, and take tens of seconds to time out.

If the user then attempts to use an application which legitimately uses a WebSocket to "wss://google.com:443/", then due to the throttling to google.com it will not be able to connect until all of the attacker's connections have timed out.

The user will perceive that the legitimate application is malfunctioning, since there is no visible sign that they are being attacked. From the server end, the attack is only apparent in the firewall logs. The actual rate of SYN packets dropped will be small and unlikely to trigger an alert.

On the other hand, with the (host,port) interpretation, connections to google.com:81 do not block connections to google.com:443, and this attack is completely ineffective.

=== Verifier Notes ===

The subsequent paragraph already indicates what must be done in the case where the IP address cannot be determined, so this change should not create any difficulties in the case where the connection is tunnelled via an HTTP(S) or SOCKS5 proxy.

A separate concern has been raised that this section creates problems for WebSocket proxies and non-browser clients. That issue cannot be handled without changing the meaning of the text, so it would have to be dealt with in a document update.

1011
1011 indicates that a server is terminating the connection because
it encountered an unexpected condition that prevented it from
fulfilling the request.

It should say:

1011
1011 indicates that a remote endpoint is terminating the connection
because it encountered an unexpected condition that prevented it from
fulfilling the request.

Notes:

As per the discussion in the WG (See <http://www.ietf.org/mail-archive/web/hybi/current/msg09628.html>) the meaning of this error close code should be extended to cover clients as well. As the Designated Expert for the WebSocket close code registry I've approved the corresponding change to the IANA registry.

1. The components of the WebSocket URI passed into this algorithm
(/host/, /port/, /resource name/, and /secure/ flag) MUST be
valid according to the specification of WebSocket URIs specified
in Section 3. If any of the components are invalid, the client
MUST _Fail the WebSocket Connection_ and abort these steps.

It should say:

1. The components of the WebSocket URI passed into this algorithm
(/host/, /port/, /resource name/, and /secure/ flag) MUST be
valid according to the specification of WebSocket URIs specified
in Section 3. If any of the components are invalid, the client
MUST _Fail the WebSocket Connection_ and abort these steps.
2. If secure is false, and the algorithm in Mixed Content's "§5.1
Does settings object restrict mixed content?" returns Restricts
Mixed Content when applied to client's entry script's relevant
settings object's, then the client MUST fail the WebSocket
connection and abort the connection.

Notes:

This change is suggested by the W3C's "Mixed Content" document (https://w3c.github.io/webappsec/specs/mixedcontent/#websockets-integration), and will bring WebSockets' behaviors into line with XMLHttpRequest, EventSource, and Fetch, all of which act as though there was a network error when blocking a mixed content request, rather than throwing a SecurityError exception.

Name of field and range is implying that it should be 63 bits in length, but documentation is saying 64 bits in 2 places.
--VERIFIER NOTES--
The intended interpretation is that lexically the field is 64 bit long but the value space is constrained to 63 bit (so the length can be represented in a 64 signed integer, with the sign bit always zero). It is a bit unconventional to use ABNF like this, but it is explained in Section 5.2.

5.4. Fragmentation
The primary purpose of fragmentation is to allow sending a message
that is of unknown size when the message is started without having to
buffer that message. If messages couldn't be fragmented, then an
endpoint would have to buffer the entire message so its length could
be counted before the first byte is sent. With fragmentation, a
server or intermediary may choose a reasonable size buffer and, when
the buffer is full, write a fragment to the network.
A secondary use-case for fragmentation is for multiplexing, where it
is not desirable for a large message on one logical channel to
monopolize the output channel, so the multiplexing needs to be free
to split the message into smaller fragments to better share the
output channel. (Note that the multiplexing extension is not
described in this document.)
Unless specified otherwise by an extension, frames have no semantic
meaning. An intermediary might coalesce and/or split frames, if no
extensions were negotiated by the client and the server or if some
extensions were negotiated, but the intermediary understood all the
extensions negotiated and knows how to coalesce and/or split frames
in the presence of these extensions. One implication of this is that
in absence of extensions, senders and receivers must not depend on
the presence of specific frame boundaries.
The following rules apply to fragmentation:
o An unfragmented message consists of a single frame with the FIN
bit set (Section 5.2) and an opcode other than 0.
o A fragmented message consists of a single frame with the FIN bit
clear and an opcode other than 0, followed by zero or more frames
with the FIN bit clear and the opcode set to 0, and terminated by
a single frame with the FIN bit set and an opcode of 0. A
fragmented message is conceptually equivalent to a single larger
message whose payload is equal to the concatenation of the
payloads of the fragments in order; however, in the presence of
extensions, this may not hold true as the extension defines the
interpretation of the "Extension data" present. For instance,
"Extension data" may only be present at the beginning of the first
fragment and apply to subsequent fragments, or there may be
"Extension data" present in each of the fragments that applies
only to that particular fragment. In the absence of "Extension
data", the following example demonstrates how fragmentation works.
EXAMPLE: For a text message sent as three fragments, the first
fragment would have an opcode of 0x1 and a FIN bit clear, the
second fragment would have an opcode of 0x0 and a FIN bit clear,
and the third fragment would have an opcode of 0x0 and a FIN bit
that is set.
o Control frames (see Section 5.5) MAY be injected in the middle of
a fragmented message. Control frames themselves MUST NOT be
fragmented.
o Message fragments MUST be delivered to the recipient in the order
sent by the sender.
o The fragments of one message MUST NOT be interleaved between the
fragments of another message unless an extension has been
negotiated that can interpret the interleaving.
o An endpoint MUST be capable of handling control frames in the
middle of a fragmented message.
o A sender MAY create fragments of any size for non-control
messages.
o Clients and servers MUST support receiving both fragmented and
unfragmented messages.
o As control frames cannot be fragmented, an intermediary MUST NOT
attempt to change the fragmentation of a control frame.
o An intermediary MUST NOT change the fragmentation of a message if
any reserved bit values are used and the meaning of these values
is not known to the intermediary.
o An intermediary MUST NOT change the fragmentation of any message
in the context of a connection where extensions have been
negotiated and the intermediary is not aware of the semantics of
the negotiated extensions. Similarly, an intermediary that didn't
see the WebSocket handshake (and wasn't notified about its
content) that resulted in a WebSocket connection MUST NOT change
the fragmentation of any message of such connection.
o As a consequence of these rules, all fragments of a message are of
the same type, as set by the first fragment's opcode. Since
control frames cannot be fragmented, the type for all fragments in
a message MUST be either text, binary, or one of the reserved
opcodes.
NOTE: If control frames could not be interjected, the latency of a
ping, for example, would be very long if behind a large message.
Hence, the requirement of handling control frames in the middle of a
fragmented message.
IMPLEMENTATION NOTE: In the absence of any extension, a receiver
doesn't have to buffer the whole frame in order to process it. For
example, if a streaming API is used, a part of a frame can be
delivered to the application. However, note that this assumption
might not hold true for all future WebSocket extensions.

Notes:

There is no indication or mention of the payload length of the frame with regards to fragmentation. In abstract, it's apparent that the payload length specified in the header of the frame corresponds to the actual frames payload length. However, implementations have been observed that allow the headers payload length to be specified as a higher length than the actual raw payload of that frame. In the event that this occurs, some implementations are reallocating memory to support the length of what is reported as the entire payload of all fragmented messages combined, and continue building the buffer off of each frame from that point forward by allocating enough memory for specified payload length. Obviously, this opens up the potential for a memory consumption DoS attack on an implementation that uses this method. A lack of specification with regards to the RFC allows for such a mistake to happen, as it is not clearly defined in the fragmentation section.

The RFC specifies that fragmented messages should be used to send messages of an unknown length, i.e. streaming media, and therefore should also specify that the frame length bit should be preserved to the length of only that current frame and not be used to specify the overall payload length of all fragmented messages combined. Implementations should be forced to work with them on a frame-by-frame basis and to drop the connection of any instances where the specified length of the frames payload does not match the length of the actual raw payload data for that one frame.
--VERIFIER NOTES--
Thanks, Jonathan, for some useful comments. As we discussed, these aren't appropriate for the errata system, but should be discussed on the relevant mailing list for possibly inclusion in an update or follow-on to the document. That list is <hybi@ietf.org>.