Application Layer Protocols

Application layer protocols provide interfaces that allow applications to communicate over computer networks. The application layer is described in a similar way by both the TCP/IP and OSI network models; their specifications differ only in details.

HTTP

HTTPS

SSL/TLS

IRC

Hypertext Transfer Protocol

HTTP is the most important application layer protocol used on the Internet. It allows clients to request data from servers (websites) and to send data to servers.

HTTP is as old as the World Wide Web itself. The first version of the protocol (HTTP/0.9) only allowed clients to request HTML (HyperText Markup Language) pages. Two newer versions, HTTP/1.0 and HTTP/1.1, were published in 1996 and 1997 respectively. The latest documentation of HTTP/1.1 consists of six specifications: RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, and RFC 7235. The most popular version of the protocol is HTTP/1.1. Some older or simpler browsers use HTTP/1.0. The earliest version, HTTP/0.9, is used very rarely, although many applications still support it.

Another version of the protocol (HTTP/2) was officially published as RFC 7540 in May 2015.

HTTP Overview

HTTP is a request-response protocol between a client and a server. Every HTTP session is initiated by creating a TCP connection to the server (usually to port 80) and then sending a request with the desired URL address. The server returns the requested file (or other content) and, usually, terminates the connection.
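The exchange described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a complete client: the function names build_get_request and fetch are hypothetical, the Host and Connection headers are added because HTTP/1.1 servers generally expect them, and no error handling or response parsing is attempted.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",        # required by HTTP/1.1
        "Connection: close",    # ask the server to close the connection afterwards
        "",                     # empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send one request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:        # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("www.example.com", "/directory") would return the raw response bytes, status line and headers included, provided the host is reachable.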

In HTTP/1.1 a single connection can be used to download a page and all other files needed to display its content (images, stylesheets, scripts). There is no need, as in previous versions, to establish a separate connection for every file. However, the HTTP protocol itself is connectionless: after sending all requested data, the server ends the connection between itself and the client.

What is more, HTTP is stateless. This means that neither the client nor the server stores information about the other between different requests. Finally, HTTP is media independent. Any data can be transmitted as long as both the client and the server know how to handle it. The data type has to be specified using an appropriate MIME type.

HTTP Protocol Design

The most common port used for the protocol is port 80, and sometimes port 8080. An HTTP server listens on the port for a client's request message. After receiving the request, the server sends back a response message. The response usually contains the requested file, though the server may also return an error or other information.

Messages in HTTP/0.9

In HTTP/0.9 the requests sent to the server were quite simple. They contained only the GET keyword followed by the URL path (with a query string) and were terminated by a newline character. A simple HTTP/0.9 request may look similar to the one below:

GET http://www.example.com/directory

The server should immediately return the requested document (ASCII text with HTML data or plain text prefixed with PLAINTEXT). The server doesn't hold any state related to the browser.
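As a small illustration of how little an HTTP/0.9 request contains, the sketch below builds one. The helper name build_http09_request is hypothetical, and terminating the line with CRLF is an assumption made here for the example.

```python
def build_http09_request(target: str) -> bytes:
    """Build a one-line HTTP/0.9 request: the GET keyword, the target, a line ending."""
    if any(ch in target for ch in " \r\n"):
        raise ValueError("target must not contain spaces or line breaks")
    # No protocol version and no headers: the whole request is a single line.
    return f"GET {target}\r\n".encode("ascii")
```

Note that, unlike later versions, there is no place in such a request to put headers or a body.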

Messages in HTTP/1.0 and HTTP/1.1

Both modern protocol versions, HTTP/1.0 and HTTP/1.1, allow a user to specify additional data in a request. First of all, there are more methods available:

Some of these methods are considered safe because they are meant to be used only for retrieving information from a server (HEAD, OPTIONS, GET, TRACE). Other methods can cause side effects on the server or even on external sites (DELETE, PUT, POST). In practice, however, the handling of GET requests is not restricted in any way, so this method should also be treated as potentially dangerous.

The first line of a request message contains the request method name, together with the URL path and protocol version information. It may look similar to this one:

POST /first_directory/second_directory/doit.php HTTP/1.1

The first line can be followed by further lines called headers. Each header contains one name-value pair (the name and the value are separated by a colon). There are a few common request headers included in most clients' requests:

The server responds by sending its own message. The message starts with the Status-Line (that contains, as mentioned above, the protocol version, a numerical Status-Code, and a human-readable Reason-Phrase).
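The message layout described above (a first line, then colon-separated headers) can be illustrated with a small parser. This is a simplified sketch: the function names are hypothetical, the Host and Content-Length headers in the usage example are just illustrative, and real parsers must deal with many edge cases that are ignored here.

```python
def parse_request_head(raw: bytes):
    """Split a request head into (method, target, version, headers)."""
    # Everything before the first empty line is the head of the message.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = []
    for line in header_lines:
        name, _, value = line.partition(":")   # name and value are separated by a colon
        headers.append((name.strip(), value.strip()))
    return method, target, version, headers


def parse_status_line(line: str):
    """Split a Status-Line into (version, status code, reason phrase)."""
    parts = line.rstrip("\r\n").split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason
```

For example, parse_status_line("HTTP/1.1 404 Not Found") yields the version string, the integer 404, and the reason phrase "Not Found".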

Security of HTTP

The biggest weakness of HTTP is that the communication between the client and the server is not encrypted. All messages can easily be eavesdropped on while in transit from one computer to another. HTTPS is the most popular method of establishing secure HTTP connections.

An intruder can also use HTTP request messages with the TRACE method, if it is enabled, to gather information about the attacked system. It may therefore be advisable to disable the HTTP TRACE method, for example by using mod_rewrite.

Security issues caused by different implementations

There are a few HTTP security problems caused by the fact that different server and client applications provide different levels of support for older versions of the protocol.

For example, the HTTP/1.0 specification states that HTTP/1.0 clients must be able to handle valid responses in both the HTTP/0.9 and HTTP/1.0 formats. Although other documents take different approaches, most browsers support HTTP/0.9 and are thus vulnerable to attacks performed as shown below:

Other problems are related to the fact that HTTP/1.1 allows a newline to be encoded as a single CR character. At present, most web servers (for example Apache or IIS) do not fulfil this requirement and do not treat a single CR as a newline. However, most browsers (Internet Explorer, Safari, Opera) do recognize it as one. Therefore, messages containing bare CR characters may be interpreted and handled differently by various browsers and servers.

A similar issue is caused by the concept of multiline headers, which was introduced in HTTP/1.1. The idea is that if a header line begins with whitespace, it should be treated as a continuation of the previous header line. Here, most web servers (including Apache and IIS) handle multiline headers while most browsers (including Internet Explorer, Safari, Opera) don't. Again, the same message with multiline headers can be treated in different ways.

The HTTP specification doesn't specify how web servers and client applications should treat duplicate or ambiguous headers. As one could expect, different software handles them in different ways (accept first, accept last, discard all).

The security of HTTP, like the security of the whole Internet, suffers from the way in which the structure and ideas of the Web were developed: many vendors and producers, incompatibilities, features often developed bottom-up, and a lack of concern about security and privacy in the early days of the Web.
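The effect of this disagreement can be shown with two hypothetical line splitters, one that accepts only CRLF and one that also accepts a bare CR or LF. The header names in the sample data are made up for the example; the point is only that the two parsers see a different number of header lines in the same bytes.

```python
import re


def split_lines_strict(raw: bytes):
    """Treat only the CRLF sequence as a line terminator."""
    return raw.split(b"\r\n")


def split_lines_lenient(raw: bytes):
    """Treat CRLF, a bare LF, or a bare CR as a line terminator."""
    return re.split(rb"\r\n|\n|\r", raw)


# The same bytes, containing one bare CR, parsed by both strategies.
raw = b"X-One: a\rX-Two: b\r\nX-Three: c"
strict_view = split_lines_strict(raw)    # X-Two stays hidden inside the first line
lenient_view = split_lines_lenient(raw)  # X-Two becomes a header line of its own
```

A header that one side never sees as a separate line is exactly the kind of ambiguity an attacker can exploit when a server and a browser process the same message.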
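A parser that supports multiline headers has to join continuation lines back onto the previous header before interpreting them. The sketch below shows one way to do this; the function name unfold_headers and the choice to join the parts with a single space are assumptions made for the example.

```python
def unfold_headers(lines):
    """Join lines that start with a space or tab onto the previous header line."""
    unfolded = []
    for line in lines:
        if line[:1] in (" ", "\t") and unfolded:
            # Continuation line: append its content to the previous header.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)
    return unfolded
```

Given the lines "X-Long: first part" and "  second part", this function produces one header, whereas software without multiline support would see a header followed by a malformed line.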
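The three strategies mentioned above can be compared directly. In the following sketch the function pick_header and its policy names are hypothetical; it only illustrates that the same list of headers can yield three different answers.

```python
def pick_header(headers, name, policy="first"):
    """Resolve a possibly duplicated header according to the chosen policy."""
    values = [value for key, value in headers if key.lower() == name.lower()]
    if not values:
        return None
    if policy == "first":
        return values[0]
    if policy == "last":
        return values[-1]
    if policy == "discard":
        # Keep the value only if it is unambiguous.
        return values[0] if len(values) == 1 else None
    raise ValueError(f"unknown policy: {policy}")
```

With headers [("Content-Length", "10"), ("Content-Length", "99")], the policies "first", "last" and "discard" return "10", "99" and None respectively, so two programs reading the same message may disagree about where the body ends.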

Hypertext Transfer Protocol Secure

HTTPS allows users to browse the Internet using the HTTP protocol with additional encryption. The first version of HTTPS was created in 1994 by Netscape Communications.

All HTTP content is encrypted by the SSL or TLS protocols, which also operate on the Application Layer but below HTTP itself. An attacker who overhears the communication can see only SSL or TLS frames carrying encrypted data. The underlying SSL/TLS protocol typically uses long-term public and private keys to generate a symmetric key for each communication session.

Encryption

As mentioned before, HTTPS in fact consists of two protocols, HTTP and SSL/TLS. It operates over a Transport Layer protocol, usually TCP. The TCP connection itself is not encrypted, which means that the IP addresses and port numbers are visible to others. Thus, an attacker may learn the domain name but won't be able to determine the full URL path.
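To make the distinction concrete, the sketch below splits a URL into the parts an observer may learn and the parts that travel inside the encrypted channel. It follows the simplified picture given above; the function name split_visibility and the sample URL are illustrative only.

```python
from urllib.parse import urlsplit


def split_visibility(url: str):
    """Return (observable, encrypted) parts of an HTTPS URL, as two dictionaries."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # Host and port can be inferred from the connection itself.
    observable = {"host": parts.hostname, "port": port}
    # The path and query string are sent inside the encrypted HTTP request.
    request_target = parts.path or "/"
    if parts.query:
        request_target += "?" + parts.query
    encrypted = {"request_target": request_target}
    return observable, encrypted
```

For "https://www.example.com/account/settings?tab=keys", only "www.example.com" and port 443 fall into the observable part.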

The strength of the encryption depends strictly on the underlying SSL/TLS protocol.

It is recommended to use algorithms that provide perfect forward secrecy. This means that an attacker who possesses one of the long-term asymmetric secret keys (used to create the HTTPS session) won't be able to derive the short-term session key or decrypt the actual messages. Not all asymmetric algorithms provide such security. Two well-known algorithms that do provide it are the Diffie-Hellman and Elliptic curve Diffie-Hellman key exchange algorithms.
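The idea behind Diffie-Hellman can be illustrated with a toy exchange. The parameters below (the prime P and the base G) are chosen only for readability and are far too small for real use; real deployments rely on standardized groups or elliptic curves provided by a cryptographic library. Forward secrecy comes from the fact that each side generates a fresh private value per session and discards it afterwards.

```python
import secrets

# Toy parameters, assumed for illustration only.
P = 2**127 - 1   # a known Mersenne prime; much too small for real security
G = 3


def make_keypair():
    """Generate a fresh (private, public) pair for one session."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, P)


client_private, client_public = make_keypair()
server_private, server_public = make_keypair()
# Only the public values cross the network; both sides derive the same secret.
client_view = shared_secret(client_private, server_public)
server_view = shared_secret(server_private, client_public)
```

Because the private values are never transmitted and are not derived from any long-term key, learning a long-term key later does not reveal the session secret.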

Authentication

What is more, HTTPS provides website authentication. This protects against man-in-the-middle attacks. Web servers can be authenticated by presenting certificates to web browsers. Web browsers come with the certificates of major authorities (such as Symantec, Comodo, GoDaddy, etc.) pre-installed and are therefore able to verify the authenticity of the other side.

When a web browser fails to authenticate the web server it tries to connect to (for example due to an invalid certificate), it will usually display an appropriate warning to the user. Also, encrypted connections are usually indicated by a green lock icon located near the URL bar.

HTTP vs. HTTPS

The HTTPS URL is the same as the HTTP one, aside from its scheme token (https:// vs. http://). As mentioned above, most web browsers indicate usage of HTTPS by showing an icon of a green padlock.

HTTPS uses a different port number than the ordinary HTTP protocol: 443 by default, as opposed to port 80 used by HTTP.

To make sure that communication using HTTPS is secure, it is necessary that all content of a web-page (that means not only text but also images, scripts, etc.) is loaded over HTTPS. To avoid surveillance and various types of attacks, strictly no data should be provided over the HTTP protocol.

Secure Sockets Layer and Transport Layer Security

Both Transport Layer Security (TLS) and Secure Sockets Layer (SSL) refer to the same set of Application Layer protocols. They are used for protecting data exchanged by other Application Layer protocols.

SSL was originally developed at Netscape. There are three versions of the SSL protocol, created in 1994, 1995 and 1996 respectively. The first version of TLS was presented in 1999 as an improvement on the existing SSL 3.0 protocol. After 1999, two further TLS versions were officially released: TLS 1.1 in 2006 and TLS 1.2 in 2008. The next version, TLS 1.3, is currently being prepared (as of 2016) and is due to be released soon. Generally, each newer SSL and TLS version introduced new, more reliable cryptographic algorithms, while older and insecure ones were removed. The protocol name was changed from SSL to TLS to avoid potential legal issues with Netscape.

Nowadays, the public and private keys used by TLS/SSL asymmetric algorithms contain thousands of bits. There exist a few popular implementations of the TLS/SSL protocol for major programming languages and operating systems. OpenSSL is perhaps the most popular one.

TLS/SSL protocols operate on the Application Layer under other Application Layer protocols and are supposed to protect the messages exchanged by the latter. Currently, TLS/SSL protocols are used to secure all major web functionalities:

TLS/SSL usually cooperates with the reliable TCP protocol operating on the Transport Layer. However, there exist also implementations that work with other Transport Layer protocols, including the unreliable ones like UDP.

TLS messages are called records. Each record contains several control fields which describe the protocol version, the message type, the message length, etc. The control fields are followed by the actual data, and then by the (optional) message MAC and the (also optional) padding bytes.
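The control fields at the start of a record can be read with a few lines of code. The sketch below decodes the five-byte record header (content type, protocol version, payload length); the function name parse_record_header is hypothetical, and the code does not attempt to decrypt or validate the payload.

```python
import struct

# Content type values used in the TLS record header.
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}


def parse_record_header(data: bytes):
    """Decode the 5-byte header that precedes every TLS record payload."""
    if len(data) < 5:
        raise ValueError("a TLS record header is 5 bytes long")
    # One byte for the type, two for the version, two for the length (big-endian).
    content_type, major, minor, length = struct.unpack("!BBBH", data[:5])
    return {
        "type": CONTENT_TYPES.get(content_type, "unknown"),
        "version": (major, minor),   # (3, 3) denotes TLS 1.2
        "length": length,
    }
```

A record starting with the bytes 16 03 03 00 05 (hexadecimal) is therefore a handshake record with version (3, 3) and a five-byte payload.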

Encryption

Before establishing the connection, both sides negotiate the encryption parameters using the so-called TLS handshake protocol. They must agree on which encryption algorithm will be used and create the appropriate cryptographic keys. The encryption used later for securing all messages is symmetric, and the negotiated symmetric key is usually valid for only one session.

The process of establishing the shared secret key is secure: an eavesdropper cannot obtain the key even after intercepting all the messages exchanged between the client and the server. What is more, the handshake protocol guarantees that the negotiated secret key has not been altered by an intruder during transmission, that is, that the communication is reliable.

The whole process of establishing the secure connection is protected against man-in-the-middle attacks.

Authentication

Both sides may authenticate themselves before creating the session. The authentication is performed by using the digital certificates signed by trusted third parties and asymmetric encryption with public and private keys.

The authentication step is optional and one or both sides may not require it. Usually, for convenience, only the server authenticates itself.

The client may authenticate the other side by using the other side's public key (available from a certificate issued by a trusted Certificate Authority) to decrypt information that the other side encrypted earlier with the corresponding private key. If the information can be properly decrypted, the client should assume that the other side can be trusted.
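The principle of proving possession of a private key can be shown with textbook RSA and deliberately tiny numbers. This is a toy sketch only: the key values below are a classic classroom example, the function names are hypothetical, and real TLS implementations use large keys, hashing and padding schemes from a cryptographic library.

```python
# Toy RSA key: N = 61 * 53, public exponent E, private exponent D.
N, E, D = 3233, 17, 2753


def sign(digest: int) -> int:
    """Transform a value with the private key (only the key owner can do this)."""
    return pow(digest, D, N)


def verify(digest: int, signature: int) -> bool:
    """Undo the transformation with the public key and compare with the expected value."""
    return pow(signature, E, N) == digest


challenge = 65                      # must be smaller than N in this toy setting
proof = sign(challenge)             # produced by the side being authenticated
accepted = verify(challenge, proof)
```

Anyone holding the public key can run verify, but producing a proof that passes the check requires the matching private key.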

Message Integrity

The whole communication protected by TLS/SSL is reliable and the protocol itself checks the integrity of all received messages.

The integrity checks are based on message authentication codes attached to all messages. They are supposed to protect the messages against damage and alteration.
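A message authentication code can be demonstrated with Python's standard hmac module. The sketch below is a simplification: the function names protect and check are hypothetical, HMAC-SHA256 is just one possible algorithm, and an actual TLS record MAC also covers fields such as a sequence number that are left out here.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # size of a SHA-256 digest in bytes


def protect(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def check(key: bytes, message: bytes) -> bytes:
    """Verify the tag and return the payload, or raise ValueError if it was altered."""
    payload, tag = message[:-TAG_LENGTH], message[-TAG_LENGTH:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):   # constant-time comparison
        raise ValueError("MAC check failed")
    return payload
```

Changing even a single byte of the payload or of the tag makes check raise an error, because the recomputed tag no longer matches.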

Similarly to other TLS/SSL functionalities, message integrity may also be provided by various different cryptographic algorithms, depending on the client and server capabilities.

Handshake Protocol

The handshake procedure begins just after both sides have agreed to use TLS. The client and the server choose all the parameters of the secure connection they are going to create.

If any of the steps described above fails (on either side), the connection is cancelled. The second phase of communication, the record protocol, will not be started.

Because negotiating a session with an asymmetric encryption algorithm is a rather expensive procedure, either side may try to resume a previously used session instead of creating a new symmetric key. If the other side accepts, they will use the secret keys created for the previous session.

Security of TLS/SSL

The secure TLS/SSL connection may be configured to use various underlying symmetric and asymmetric encryption algorithms. The strength of the protection depends strongly on the selected cipher and its implementation.

The first two SSL protocol versions are generally considered unsafe, whilst the third SSL version is comparable to TLS 1.0. In contrast, the newer TLS versions are much more refined and provide much better security. Although several attacks target various TLS implementations, TLS is considered a strong and efficient tool for securing communication over computer networks.

It is recommended to create secret keys with algorithms that provide perfect forward secrecy. This guarantees that the compromise of long-term private keys (belonging, for example, to trusted Certificate Authorities) will not compromise the privacy of communications protected by the derived session keys. Certificate Authority organisations were recently targeted by many attacks, which led to the disclosure of many long-term private keys and compromised many digital certificates.
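In practice, an application can refuse the older protocol versions when it configures its TLS library. The sketch below uses Python's standard ssl module; the function name make_client_context is hypothetical, and choosing TLS 1.2 as the minimum is an assumption made for the example rather than a recommendation taken from the text.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client-side context that rejects SSL and early TLS versions."""
    # The default context already verifies certificates and host names.
    context = ssl.create_default_context()
    # Refuse to negotiate anything older than TLS 1.2 (assumed policy).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to functions such as context.wrap_socket when opening a connection.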

Internet Relay Chat

IRC is an application layer protocol that allows users to exchange text messages.

The protocol was created in 1988 by a Finnish software engineer, Jarkko Oikarinen. It was designed mainly for group communication via discussion forums called channels, but the protocol also allows users to send and receive private messages or data.

IRC Overview

IRC works in a client/server model. First, every user has to install a client application. Using the client application, it is possible to send text messages to the IRC server, which forwards them to other clients. The servers are connected together and form larger groups, so they can exchange messages among themselves.

There are several IRC services that provide some additional functionalities, like bots (sending messages generated by computer programs to channels) or bouncers (daemon processes that provide IRC communication to offline users or to computers without any IRC client installed).

The image below presents an example of the IRC network:

IRC Protocol Design

Usually IRC runs over the TCP protocol. The official TCP port assigned to IRC is 194; however, to avoid having to run the server application with root privileges, IRC most commonly runs on port 6667/TCP or a few other nearby ports (6660-6669 and 7000).

IRC specification is covered by several documents, RFC 1459 and a couple of later ones: RFC 2811, RFC 2812, and RFC 2813. However, most client and server applications don't follow the design strictly.

IRC was originally used only for sending text messages. Each character was encoded using 8 bits, without specifying the type of encoding. This could cause problems when users in a conversation were using different encodings. At present, UTF-8 is the most popular encoding used in IRC messages and it is supported by most IRC applications.

IRC users communicate with the server and other users by sending simple text commands. Every command specifies the recipient (a server, a channel or another user) and additional parameters, such as the text of the message.
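Because a message does not state its encoding, a client has to guess. One common approach, sketched below, is to try UTF-8 first and fall back to a legacy 8-bit encoding. The function name decode_irc_line and the choice of Latin-1 as the fallback are assumptions made for the example.

```python
def decode_irc_line(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode a received line as UTF-8, falling back to a legacy 8-bit encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this step cannot fail.
        return raw.decode(fallback)
```

The fallback never raises an error, but it may still display the wrong characters if the sender used a different 8-bit encoding.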
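The structure of such a command can be illustrated with a small parser for the textual message format (an optional prefix starting with a colon, a command, and parameters, the last of which may follow " :" and contain spaces). The function names and the nickname, host and channel used in the example are hypothetical, and the sketch ignores details such as message length limits.

```python
def parse_irc_message(line: str):
    """Split one IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The optional prefix names the origin of the message.
        prefix, _, line = line[1:].partition(" ")
    head, separator, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        raise ValueError("empty IRC message")
    command, params = parts[0], parts[1:]
    if separator:
        params.append(trailing)   # the trailing parameter may contain spaces
    return prefix, command, params


def build_privmsg(target: str, text: str) -> str:
    """Build a PRIVMSG command addressed to a channel or a user."""
    return f"PRIVMSG {target} :{text}\r\n"
```

For instance, parsing ":alice!a@host PRIVMSG #python :hello there" separates the sender prefix, the PRIVMSG command, the recipient "#python" and the message text.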

Security of IRC

The original design of IRC is insecure. Most servers don't require users to register an account and usually people can choose nicknames just before connecting to the channels.

Any change to the network structure is usually problematic and may cause various issues (for example, several users ending up with the same nickname but not necessarily the same privileges). Also, servers are assumed to trust one another when exchanging messages. A server that behaves incorrectly can cause problems for the whole network.

In the early 2000s some IRC networks were often attacked using DDoS and other more sophisticated attacks. This caused many users to migrate to different IRC networks or to abandon this way of communication completely.

The limitations of the protocol are well known, and therefore improvements are often introduced in modern implementations. A lot of IRC servers have already started to support secure SSL/TLS connections.

IRC Today

IRC was at its most popular in 2003, when it is estimated to have been used by over one million people on hundreds of thousands of channels. By 2014, the number of users had decreased to less than half a million. The reasons why people use IRC applications have also changed.

In the beginning, IRC networks were used for social networking; however, websites like Facebook or Twitter have now taken over these functions. People used to use IRC networks to broadcast unofficial or illegal news and information. At present, there are much better ways to do it (like Tor). IRC channels were also used to exchange information about pirated software and warez. Nowadays, bad guys prefer to look for such information in other places, like P2P networks.

Due to commercialization of the Internet, a lot of companies have decided to invest money in their own products and to create their own ways of communication instead of using publicly available IRC. On the other hand, there are several IRC-based commercial or open source projects that are widely used by development teams and various firms and organizations for internal and external communication.

IRC is a very old protocol that has been in use for many years. The way the protocol is used has changed over that time. One may predict that IRC technology will still be used in various applications and services, at least over the next several years.

At present, IRC client applications are widely available for all major operating systems (mIRC, irssi, HexChat) and browsers (Mibbit, KiwiIRC). Most web browsers provide add-ons that allow users to use IRC. The most popular server application is perhaps IRCd.

Before the 1990s, when IRC was used mostly in Scandinavian countries, 7-bit encoding was the most popular one. It contained some quite pretty non-ASCII letters like ä, ö, and å.

One may mention Freenode, a free network for open source software communities. Another example is Campfire, a commercial web-based application that provides secure chat rooms and the transmission of files and images. A slightly different project, Hubot, is also worth mentioning.