Andy "Krazy" Glew is a computer architect, a long time poster on comp.arch ... and an evangelist of collaboration tools such as wikis, calendars, blogs, etc. Plus an occasional commentator on politics, taxes, and policy. Particularly the politics of multi-ethnic societies such as Quebec, my birthplace.

The content of this blog is my personal opinion. It is not that of my employer. See Disclaimer.

Photo credit: http://docs.google.com/View?id=dcxddbtr_23cg5thdfj

Disclaimer

The content of this blog is my personal opinion only. Although I am an employee - currently of Nvidia, in the past of other companies such as Imagination Technologies, MIPS, Intellectual Ventures, Intel, AMD, Motorola, and Gould - I reveal this only so that the reader may account for any possible bias I may have towards my employer's products. The statements I make here in no way represent my employer's position, nor am I authorized to speak on behalf of my employer. In fact, this posting may not even represent my personal opinion, since occasionally I play devil's advocate.

(New user in grc.securitynow. Longtime podcast listener. A very long time
ago, a Usenet user (not so much nowadays). My apologies if this is a FAQ.)

OK, so there's a trend to encrypt all traffic - to use https, to
discourage http. If for no other reason than to make man-in-the-middle
attacks harder.

One of the big losses is caching: the ability for somebody like a school
in a bandwidth deprived part of the world (like Africa, now; like
parts of Canada, when I grew up, although no longer so true) to cache
read-only pages that are used by many people. Like the website I used
to run, and which I hope to bring back up sometime soon - a hobbyist
website for computer architects. No ads. No dynamic content.

Heck, like this newsgroup would be, if it were presented as webpages.

HTTPS encryption, with a different key for each session, means that you
can't cache. Right?

Q: is there - or why isn't there - an HTTPS-like protocol where the
server signs the data, but where the data is not encrypted?

(I thought at first that the null cipher suite in HTTPS / TLS was that,
but apparently not so.)

Having the server sign the data would prevent man-in-the-middle
injection attacks.

An HTTPS-like handshake would be needed to perform the initial
authentication, verifying that the server is accessible via a chain of
trust from a CA you trust. (Bzztt.... but I won't rant about web of
trust and CA proliferation.)

Possibly you might want to encrypt the traffic from user to server,
but only sign the traffic from server to user.

So, why isn't this done?

It seems to me it would solve the "HTTPS means no caching" problem.

OK, possibly I can answer part of my own question: signing uses the
expensive public key cryptography on each and every item that you might
want to sign. Whereas encryption uses relatively cheaper bulk encryption,
typically symmetric key ciphers like AES.

Signing every TCP/IP packet might have been too expensive back in the early days
of the web. Not to mention issues such as packet fragmentation and recombining.

But note that I say "each and every item that you want to sign".
Perhaps you don't need to sign every packet. Perhaps you might only
sign every webpage. Or every chunk of N-kiB in a web page.
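A rough sketch of the "sign every chunk" idea, under assumptions that are not in the post: instead of one public-key signature per chunk, the server hashes each chunk, hashes the list of chunk hashes, and signs only that one value (a plain hash list, a swapped-in technique). The names CHUNK_SIZE, chunk_digests, manifest_digest and verify_chunk are hypothetical, and the actual public-key signing step is left out because Python's standard library has no signature primitive.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical "N KiB" chunk size; the post leaves N open


def chunk_digests(page: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Return the SHA-256 digest of each fixed-size chunk of the page."""
    return [
        hashlib.sha256(page[i:i + chunk_size]).digest()
        for i in range(0, len(page), chunk_size)
    ]


def manifest_digest(digests: list[bytes]) -> bytes:
    """Hash the concatenated chunk digests. This single value is what the
    server would sign with its private key (signing itself is not shown)."""
    return hashlib.sha256(b"".join(digests)).digest()


def verify_chunk(chunk: bytes, index: int, digests: list[bytes]) -> bool:
    """Check one received chunk against an already-verified list of digests."""
    return index < len(digests) and hashlib.sha256(chunk).digest() == digests[index]


# Example page: three chunks at the assumed chunk size.
page = b"<html>" + b"x" * 10000 + b"</html>"
digests = chunk_digests(page)
root = manifest_digest(digests)
```

With this layout a cache could store the page, the digest list, and one signature over root; a client would pay for one public-key verification per page and one cheap hash per chunk.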

A browser might not want to start building a webpage for display until
it has verified the signature of the entire thing. This would get in
the way of some of the nice incremental fast rendering approaches.

But, perhaps the browser can incrementally render, just not enable
Javascript until the signature has been verified? Or not allow such
Javascript to make outgoing requests? I am a computer architect: CPU
hardware speculatively executes code before we know it is correct, and
cancels it if not verified. Why shouldn't web browsers do the same?
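The speculate-then-commit idea can be modelled in a few lines. This is only a toy sketch, not how any real browser works: the SpeculativePage class and its fields are made up for illustration, and the trusted digest is assumed to have been obtained from an already-verified server signature.

```python
import hashlib


class SpeculativePage:
    """Toy model of a browser that renders unverified chunks but keeps
    scripts disabled until the whole page matches a trusted digest."""

    def __init__(self, trusted_digest: bytes):
        # Assumed to come from an already-verified server signature.
        self.trusted_digest = trusted_digest
        self.hasher = hashlib.sha256()
        self.rendered = []            # chunks shown speculatively
        self.scripts_enabled = False  # stays off until verification passes

    def receive(self, chunk: bytes) -> None:
        """Render the chunk right away and fold it into the running hash."""
        self.hasher.update(chunk)
        self.rendered.append(chunk)

    def finish(self) -> bool:
        """Commit if the digest matches; otherwise squash the speculative work."""
        if self.hasher.digest() == self.trusted_digest:
            self.scripts_enabled = True
            return True
        self.rendered.clear()
        return False


chunks = [b"<html>", b"<script>go()</script>", b"</html>"]
good = hashlib.sha256(b"".join(chunks)).digest()

page = SpeculativePage(good)
for c in chunks:
    page.receive(c)
committed = page.finish()

tampered = SpeculativePage(good)
for c in [b"<html>", b"<script>evil()</script>", b"</html>"]:
    tampered.receive(c)
squashed = not tampered.finish()
```

The analogy to CPU speculation is loose: rendering is the speculative work, enabling scripts is the commit, and clearing the rendered chunks is the squash.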

I.e. I don't think latency of rendering should be an obstacle to having
cacheable, signed but not encrypted, HTTPS-like communication.

Probably the plain old computational expense would be the main
obstacle. I remember when just handling the PKI involved in opening an
SSL connection was a challenge for servers. (IIRC it was almost never a
challenge for clients, except when they opened too many channels to try
to be more parallel.) What I propose would cost even more.

But:

(1) CPUs are much faster nowadays. Would this still really be a
problem?

+ I'm a computer architect - I *love* it when people want new
computationally demanding things. Especially if I can use CPU
performance (or GPU, or hardware accelerator performance), which is
relatively cheap, to provide something with social value, like saving
bandwidth in bandwidth challenged areas of the world (like Africa - or,
heck, perhaps one day when the web spans the solar system).

(2) Enabling caching (or, rather, keeping caching alive) saves power -
and I mean power in the real, Watt-hours, sense - while requiring
signatures and verifying them consumes CPU cycles. I am not sure that
the tradeoff prohibits what I propose.

1 comment:

A bit late, but there are some misconceptions in this post that shift the tradeoff around - in enough directions that I'm not sure what the final location of the point is :P

1.) The big reason to push pervasive encryption (not just HTTPS, but DNScrypt, SMTP over TLS, IMAP over TLS, etc.) is not just "Prevent MITM"

1.1.) The difference between "private" and "secret" is a fuzzy one, but both need encryption rather than just authenticity - and passive-taps suffice against auth-only; with the NSA's avowed tendency towards slurping EVERYTHING, this is a notable thing.

1.2.) The server and the client may disagree on what is and is not private.

2.1.) CDNs can be located near the end-user, and use TLS on a shorter hop. The vast majority of content by bandwidth is essentially static (videos, etc), and CDN-as-a-service is an ancient business.

3.) Authentication-only does _not_ require public-key operations on each message

3.1.) You do the asymmetric handshake as normal, then use a MAC (Message Authentication Code, can be built out of any secure hash using HMAC) to authenticate each message. This is fast - approximately the same speed as a raw hash, plus a small constant factor.

4.) Using authentication-only modes may _not_ improve cacheability

4.1.) The HTTP response over the wire often contains headers that are variant with client, even if the actual data is invariant. In such a case, the MAC of the individual messages would vary, and cannot be used across multiple messages.

4.2.) As a result, you'd need to do the authentication at a higher level - within each protocol such as HTTP - to do it separately on the headers and the bodies. This is a fraught endeavor.
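Points 3.1, 4.1 and 4.2 above can be illustrated with a small sketch using Python's standard hmac module. Everything specific here is an assumption for illustration: the header strings and cookie values are made up, the random keys stand in for whatever a real handshake would derive, and the mac and verify helpers are hypothetical names.

```python
import hashlib
import hmac
import os


def mac(session_key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 over one message, keyed by the per-session secret."""
    return hmac.new(session_key, message, hashlib.sha256).digest()


def verify(session_key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check of a received tag."""
    return hmac.compare_digest(mac(session_key, message), tag)


body = b"<html>static hobbyist page</html>"

# Hypothetical per-client headers; the cookie values are made up.
headers_a = b"HTTP/1.1 200 OK\r\nSet-Cookie: id=alice\r\n\r\n"
headers_b = b"HTTP/1.1 200 OK\r\nSet-Cookie: id=bob\r\n\r\n"

# Stand-ins for the keys two separate sessions would derive in their handshakes.
key_a = os.urandom(32)
key_b = os.urandom(32)

# Point 3.1: one cheap MAC per message, no public-key operation.
tag_a = mac(key_a, headers_a + body)
tag_b = mac(key_b, headers_b + body)

# Point 4.1: even under one key, differing headers alone change the tag.
tag_a_with_b_headers = mac(key_a, headers_b + body)

# Point 4.2: an unkeyed digest over the body alone is the same for every client.
body_digest = hashlib.sha256(body).hexdigest()
```

Note that the tags differ between clients for two reasons: the headers differ, and each session has its own key. The body digest is reusable by a cache, but on its own it proves nothing about who produced the body; it would still need a server signature over it, which is the higher-level work that point 4.2 calls fraught.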

Overall, I _think_ this moves the balance mostly towards "The costs outweigh the benefits", but it's certainly not black and white. It's also certainly not simple to just _try_ it and see, sadly.