What is JWT?

In essence it’s a signed piece of data in JSON format. Because it’s signed the recipient can verify its authenticity. Because it’s JSON it weights very little. If you are after the formal definition, it’s in the RFC 7519.

This article was featured on Hacker News. Have a look at the case study exposing analytics, SEO impact, performance hit and more.

Signed data is nothing new - what’s thrilling here is how JWT can be used to create truly RESTful services, with no sessions. As it turns out the the idea has been around for a while. Here’s how it works in physical world - I will draw the analogies straight after:

Imagine you are coming back to your country from holidays abroad. You are at the border and you say - you can pass me through, I’m a homie. All fine and dandy but how can you support your claim? Most probably you are carrying a passport confirming your identity. Let’s assume the border staff has all that is required to tell for sure if the passport is genuinely issued the Passport Office of your country. The passport proves to be in order and they let you through.

Now, let’s look at the story from JWT perspective to see who is who:

The Passport Office - authentication service which issued the JWT.

Passport - your JWT signed by the Passport Office. Your identity is readable to everyone who looks at it but interested parties can verify if it’s genuine.

Citizenship - your claim contained in the JWT (your passport).

Border - security layer in your app verifying the JWT token before granting access to a secured resource, in this case - the country.

Country - resource you want access to (e.g. API).

Look ma, no session!

In very simple terms, JWT are cool because you don’t need to keep session data on the server in order to authenticate the user. The workflow goes like so:

The user calls authentication service, usually sending username and password.

The authentication service responds with a signed JWT, which says who the user is.

The user requests access to a secured service sending the token back.

Security layer checks the signature on the token and if it’s genuine the access is granted.

Let’s think for a moment about the consequences.

No session storage

No sessions means no session storage. Sounds like not much unless your applications needs to scale horizontally. If your application is run on multiple servers then sharing the session data becomes a burden. You either need a specialised server just for session storage or shared disk space or sticky sessions on the load balancer. None of that is needed when you don’t use sessions.

No garbage collection for sessions

Usually sessions need to be expired and garbage collected. JWT can carry its own expiry date along with the user data. Therefore the security layer checking JWT’s authenticity can also check the expiry time and simply refuse access.

Truly RESTful services

Only without sessions you can create truly RESTful services, as they are supposed to be stateless. JWT is small so it can be sent with each request, just like a session cookie. Unlike the session cookie however, it does not point to any data stored on the server, JWT has the data.

How does JWT look, exactly?

The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure

The former gives us just the signature and the data it contains (or the “claims” as they call it in JWT nomenclature) is readable to anyone. The latter offers encryption, so only someone with a key can decrypt it. The JWS implementation is much easier and the basic usage does not require encryption - after all if you have a key on the client you may as well leave the whole thing unencrypted. Therefore JWS is used in most cases and so I am going to focus on it here.

So what goes into the JWT/JWS?

Header - information about the signing algorithm, the type of payload (JWT) and so on in JSON format.

Payload - the actual data (or claims if you like) in JSON format.

Signature - well… the signature.

I will explain the details later on. For now let’s analyse the basics.

Each part mentioned above (header, payload and signature) is base64url-encoded, then they are glued together with a . to form JWT. Here is how the implementation could look like:

The resulting JWS looks neat and sweet and somewhat like this:eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJuYW1lIjoiSm9obiBEb2UiLCJhZG1pbiI6dHJ1ZX0.OLvs36KmqB9cmsUrMpUutfhV52_iSz4bQMYJjkI_TLQ

You can play around with creating the tokens on jwt.io website.

Quite an important thing is that the the signature is calculated for both the header and the payload in one go. Thus the authenticity of the header and the payload can be easily checked in one go as well:

What can be in the JWT header?

As a matter of fact, the JWT header is called JOSE header. JOSE stands for JSON Object Signing and Encryption. As you would expect, both JWS and JWE have such a header, however each has slightly different set of registered parameters. Below’s the list of the header parameters registered for JWS. All except the first one (alg) are optional:

alg Algorithm (compulsory)

typ Type (for JWT it has a value JWT, if present)

kid Key ID

cty Content Type

jku JWK Set URL

jwk JSON Web Key

x5u X.509 URL

x5c X.509 Certificate Chain

vx5t X.509 Certificate SHA-1 Thumbprint

x5t#S256 X.509 Certificate SHA-256 Thumbprint

crit Critical

The first two are the most commonly used, therefore the typical header looks somewhat along these lines:

1

{
"alg": "HS256",
"typ": "JWT"
}

The third header parameter listed above, kid, proves to be handy for security reasons. cty on the other hand should be used only when dealing with nested JWTs. The rest of them you can read upon in the spec, as I believe they don’t fit in the scope of this post.

alg (algorithm)

The value of the alg parameter can be anything specified in JSON Web Algorithms (JWA) - yet another spec, I know. Here’s the registered list for JWS:

HS256 - HMAC using SHA-256

HS384 - HMAC using SHA-384

HS512 - HMAC using SHA-512

RS256 - RSASSA-PKCS1-v1_5 using SHA-256

RS384 - RSASSA-PKCS1-v1_5 using SHA-384

RS512 - RSASSA-PKCS1-v1_5 using SHA-512

ES256 - ECDSA using P-256 and SHA-256

ES384 - ECDSA using P-384 and SHA-384

ES512 - ECDSA using P-521 and SHA-512

PS256 - RSASSA-PSS using SHA-256 and MGF1 with SHA-256

PS384 - RSASSA-PSS using SHA-384 and MGF1 with SHA-384

PS512 - RSASSA-PSS using SHA-512 and MGF1 with SHA-512

none - No digital signature or MAC performed

Please note the last one, none, which is the most interesting from the security perspective. It’s been known to be used for a downgrade attack angle. How does it work? Imagine a JWT is generated by the client with some made up claims. It specifies the none signature algorithm in the header and it sends it for verification. If the issuer is naive it takes the alg parameter as true and grants access where it shouldn’t.

The baseline is, the security layer of your app should always be suspicious about the alg parameter from the header. That’s where the kid comes handy.

typ (type)

This one’s pretty straightforward. If it’s known that it is a JWT, because the application does not expect anything else, this parameter would be ignored. Therefore it’s optional. If specified though, it should be spelled in capitals - JWT.

In some cases when the app accepts non-JWTs containing JWT it’s important to specify it, so that the app does not go bonkers.

kid (key id)

If the security layer in your app uses just one algorithm for signing the JWTs, you don’t have to worry about the alg parameter, because you are always checking integrity of the token with the same key and algorithm. If however, your app uses a bunch of different algorithms and keys, you need to be able to figure out which the token was signed with.

As we saw earlier, relying on the alg parameter alone may lead to some… inconveniences. However, if your application maintained a list of key/algorithm pairs, and each of the pairs had a name (id), you could add that key id to the header and then during verification of the JWT you would have more confidence in picking the algorithm. That’s what goes into the kid header parameter - the id of the key your app used to sign the token. The id is arbitrary and it’s up to you to assign it. What’s most important here - you gave the id, so you can verify it.

cty (content type)

The spec is pretty clear here, so I will just quote:

In the normal case in which nested signing or encryption operations are not employed, the use of this Header Parameter is NOT RECOMMENDED. In the case that nested signing or encryption is employed, this Header Parameter MUST be present; in this case, the value MUST be “JWT”, to indicate that a Nested JWT is carried in this JWT. While media type names are not case sensitive, it is RECOMMENDED that “JWT” always be spelled using uppercase characters for compatibility with legacy implementations.

What can be in the JWT claims?

Doesn’t the name “claims” bother you? It did bother me at first. I believe you need to repeat that a few times in order to get used to it. In simple terms, the claims are the meat of the JWT - this is the data that we care so much about as to sign it. It’s called “claims” because usually that’s what it is - the client claims that the username, user role or whatever else is such that it would grant him access to the resources he’s after.

Remember that lovely story I told you at the beginning? Your citizenship was the claim and your passport - the JWT.

You can put whatever you wish in the claims, there is however a registered list, which should be universally recognised amongst implementations. Please note that each of them is optional and processing for most of them is application specific. Here’s the list:

exp - Expiration Time

nbf - Not Before

iat - Issued At

sub - Subject

iss - Issuer

aud - Audience

jti - JWT ID

It’s worth noting that except for the last three (issuer, audience and JWT ID) are usually used in more complex cases, e.g. with multiple issuers. Let’s get on with it, then.

exp (expiration date)

Timestamp indicating when the token becomes invalid. The spec says that “the current date/time MUST be before” the value specified in the exp claim in order to allow processing of the token. It is also indicated that some leeway (a few minutes) is allowed in order to account for clock skew.

nbf (not before)

Timestamp indicating when the token becomes valid. The spec says that “the current date/time MUST be after or equal” the value specified in the nbf claim in order to allow processing of the token. It is also indicated that some leeway (a few minutes) is allowed in order to account for clock skew.

iat (issued at)

Timestamp indicating when the token has been issued.

sub (subject)

As the spec says “the claims in a JWT are normally statements about the subject”. Subject must be unique within the context of the issuer or globally unique. The sub claim can be used to identify the user, for example JIRA does that.

iss (issuer)

String value identifying the issuer of the token. If the value contains : it has to be an URI. It can be useful if there are many issuers with one security layer and the application needs to identify the issuer. For example Salesforce require to use OAuth client_id as the iss value.

aud (audience)

String or an array of strings identifying the intended recipient(s) of the token. If the string contains : it has to be an URI. Often used as an URI of the resource for which the claims are valid. For example, in OAuth the audience is the authorisation server. The application processing the token must verify that the audience is correct or reject the token if it is intended for different audience.

jti (JWT id)

Unique identifier for the token. This value must be unique for each issued token, even if there are many issuers. The jti claim can be used for one-time tokens, which cannot be replayed.

How to use JWT in my application?

In the most common scenario, the client in the browser will authenticate in the authentication service and receive JWT in return. Then the client stores the token somehow (e.g. memory, localStorage) and sends it back with every request for a protected resource. Usually the token is sent as a cookie or Authorization header in HTTP request:

The header method is preferred for security reasons - cookies would be susceptible to CSRF (Cross Site Request Forgery) unless CSRF tokens were used.

Secondly, the cookies can be sent back only to the same domain (or at most second level domain) they were issued from. If the authentication service resides on a different domain, cookies require much more wild creativeness.

How to log out with JWT?

Because there is no session date stored on the service side, logging out cannot be performed by destroying the session. Therefore logging out is the client’s responsibility - as soon as the client forgets the token it cannot be authorised any more and therefore can be considered logged out.

Conclusion

I think JWTs are a very clever way of authorising without sessions. They allow for creating truly RESTful services with no state remembered on the service side, meaning no session storage is required either.

Unlike session cookies automatically sent by the browser to any URL matching the domain/path combination (let’s be honest, it’s just the domain in most cases), JWTs can be selectively sent only to the resources requiring authorisation.

The implementation is pretty simple, both on the client and on the server, especially that there are specialised libraries for signing and verifying the tokens.

Thanks for reading!

If you like this article please share it. Your comments are very welcome!