I received an enthusiastic message off the contact form this morning from someone writing an online CSS optimization tool. So I had a look.

The idea of compressing CSS/HTML isn’t a new one; around 2000, I spent a bit of time writing a “cruncher” for my bloated markup that would strip whitespace and extraneous characters, and I know I was even following in someone else’s footsteps then.

But the classic problem with markup optimizers is that they’re lossy, which results in two problems.

First, you’re saving only the fraction of the file size that was otherwise lost to unneeded characters. It’s not true compression, only a shrinking of what gets sent over the wire. The savings aren’t substantial, especially in light of true server-side compression like gzip. And since CSS gets cached anyway, the initial download is a one-time hit. It’s hard to argue that the 2k or so saved on the first visit to a site, and not on subsequent visits, is worth it, considering this next problem.

Second, it’s a one-way process; once you’ve stripped those characters, you’re not getting them back. CSS isn’t like GIF, where there’s a separate original file (PSD, PNG, AI, whatever) you can go back and edit when the GIF needs to change. The CSS file is the original; try editing one that has been through the compression process and you won’t like yourself very much for having done so. Post-production readability is not something to discard lightly.

With the cruncher I built (the Visual Basic source for which is long lost, so don’t bother asking), the second problem was an important one to solve; otherwise I never would have bothered. Assuming the source markup followed a certain set of whitespace formatting rules (tabs for indentation, multiple tabs for multiple levels of indentation, etc.), you can programmatically re-create it after the fact. And despite some exceptions, I managed to get this working rather well. One button was for compressing, another for de-compressing; a file going through that process wouldn’t look exactly like the original, but it did look like something a human could edit.
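The VB source is gone, but the round trip is simple to sketch. Here’s a rough Python illustration (my reconstruction, not the original tool): `crunch` strips whitespace assuming the formatting conventions described above, and `uncrunch` rebuilds tab-indented, one-declaration-per-line formatting.

```python
import re

def crunch(css: str) -> str:
    """One-way pass: collapse all whitespace, then drop the spaces
    around punctuation. Naive (ignores strings and comments), but the
    result is still valid CSS."""
    out = re.sub(r"\s+", " ", css)
    out = re.sub(r"\s*([{}:;,])\s*", r"\1", out)
    return out.strip()

def uncrunch(css: str) -> str:
    """Reverse pass: re-create editable formatting under the assumed
    conventions (newline after braces, one tab of indentation inside
    each rule). The output won't match the original byte for byte."""
    out = re.sub(r"\s*\{\s*", " {\n\t", css)   # open brace starts a block
    out = re.sub(r";\s*", ";\n\t", out)        # one declaration per line
    out = re.sub(r"\n\t\}", "\n}", out)        # close brace back at margin
    return re.sub(r"\}\s*", "}\n", out)        # blank line between rules
```

As noted, the round trip isn’t byte-identical (spaces after colons are lost, for instance), but the result is something a human could edit.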

What I find particularly interesting about this new compressor is that, being CSS-specific, it takes advantage of the cascade to combine like rules:

#header h2 {
font-size: 2em;
}
#sidebar h3 {
font-size: 2em;
}

The above code block would compress to something like so:

#header h2,#sidebar h3 {font-size:2em;}

64 bytes of code in the first example (including carriage returns) versus 39 bytes in the second. The possible optimization really depends on selector lengths being reasonable, though; I could imagine long selectors, duplicated for different properties, causing the file size to grow if things get really messy. Still, in general, you might be able to shave off a quarter of your file size or more.
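The combining step can be sketched along these lines (Python, my guess at the approach rather than the actual tool’s internals): group selectors whose declaration blocks normalize to the same string, then emit each block once. Note that blindly reordering rules this way can change which rule wins the cascade, so a real tool would need to be more careful.

```python
import re

def combine_rules(css: str) -> str:
    """Merge rules with identical declaration blocks into one
    comma-separated selector. Naive sketch: ignores comments, at-rules,
    and cascade-order issues; also drops the final semicolon."""
    groups: dict[str, list[str]] = {}  # normalized declarations -> selectors
    for selector, body in re.findall(r"([^{}]+)\{([^}]*)\}", css):
        # normalize so "2em;" and " 2em; " compare as equal
        decls = ";".join(d.strip() for d in body.split(";") if d.strip())
        groups.setdefault(decls, []).append(selector.strip())
    return "".join(f"{','.join(sels)}{{{decls}}}" for decls, sels in groups.items())
```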

So how would one go about writing a good CSS and HTML optimizer that I might consider using? I’m not sure you could; the fact that I lost interest in the one I was writing is telling. Even if you manage to gracefully reverse the process, the extra time it takes to de-compress and re-compress every time I edit the file invalidates the relatively small benefit of using one. And everyone viewing my source would think my return key broke.

If, however, you were to write a just-in-time server-side post-processor (and throw in a few more hyphenated words while you’re at it) that doesn’t break CSS caching, then we might have something. Consider what Shaun Inman did with CSS SSC — having a script parse my CSS file just as it’s about to get sent down the wire, so that the original CSS file I wrote doesn’t get compressed (and thus I can continue to edit it without having to do anything extra), seems like the only way this could work.

Questions still linger about the ultimate effectiveness, given gzipping, and whether the parsing overhead is worth it. But a proof-of-concept that puts those concerns to rest could be an interesting experiment. Any takers?

I want to know EXACTLY what CSS is being sent to the browser – given the silly hacks/filters, overrides and other tricks of the trade we end up using.

For that reason I will never buy in to any sort of CSS optimizer like this. GZIP is perfectly adequate, doesn’t change the structure of the file at all, and is incredibly simple to implement (even if your server doesn’t have mod_gzip and the like).

I also don’t really see a point in compressing such small files even smaller. Especially when so many people are on broadband now. I also find the compressed code harder to edit. It takes longer to sift through the code to make changes when you eliminate so much white space. Still an interesting post though.

Sorry, forgot to ask: does anyone know what’s up with the W3C CSS validator? It got a lot stricter a couple of days ago, spitting out errors and warnings for things like line-height: 0; and not specifying a color with a background color or vice versa. It would take a lot of extra CSS to get rid of those warnings.

I think what Jared is suggesting is not a server-side script that compresses the CSS from your editable file to a compressed version every time it’s requested, but instead a build process that you run when you upload new changes to said website. So, in the same way that you keep PSDs in some other directory somewhere else and create GIFs when you republish a new version of a site, you could keep clean nice CSS in that same directory and then run the compressor once, and upload the newly compressed file as a static file. I’m still not sure that it buys you much, though, as you point out.

With most compressors, as you’ve said, they just strip away the unneeded parts, such as spaces and other purely aesthetic formatting. But how on earth are we meant to read it once it’s been compressed? It’s a kind of tidying up that makes the file harder to read. When we want to change it, we’ve got to hunt for certain areas, and then still get confused later on.

I see little point in this unless your CSS file is gigantic and you need some sort of real compression, like gzip. Even with gzip, it will more than likely add server strain, which I’d rather not risk in exchange for shaving a few microseconds off loading.

Another benefit to server-side just-in-time processing is that you can roll all your CSS files into one request, saving the overhead of multiple HTTP requests. Pipelining may, if you’re lucky, reduce the problem, but the HTTP headers still need to be sent with each request.

To minimize server overhead in an on-the-fly compressor/obfuscator, why not cache the result? When the CSS is requested, check the last-modified date of the file (or hash it), and if it has changed, compress it and write it to disk. Otherwise, just serve up the cached file.
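That scheme is only a few lines in practice. A sketch (Python; `compress` stands in for whatever minifier you plug in):

```python
import os

def serve_css(path: str, cache_path: str, compress) -> bytes:
    """Serve a compressed copy, rebuilding it only when the source
    file is newer than the cached result on disk."""
    if (not os.path.exists(cache_path)
            or os.path.getmtime(path) > os.path.getmtime(cache_path)):
        with open(path) as src:
            compressed = compress(src.read())
        with open(cache_path, "w") as dst:
            dst.write(compressed)
    with open(cache_path, "rb") as cached:
        return cached.read()
```

Hashing instead of the mtime comparison works the same way, at the cost of reading the source on every request.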

For those of you who don’t get the compression thing ( and reading the comments, there’s a number of you ), here’s how it works:

You install a module in your webserver ( generally mod_gzip for Apache 1.x or mod_deflate for Apache 2.x, IIS users you’re on your own ).

When a request comes through from a user for a compressible file ( & you can set which files qualify; images are generally ignored ), and the user’s browser says it can handle gzipped files, your webserver compresses the file and then sends it. The browser transparently decompresses it.

This ENTIRE process is hidden from the user, and is very safe since any browser which doesn’t send the correct ACCEPT-ENCODING header will be sent uncompressed files.
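The negotiation itself boils down to a header check. A sketch (Python standing in for the server module, which is really C):

```python
import gzip

def respond(body: bytes, accept_encoding: str = ""):
    """Mimic the content negotiation mod_gzip performs: compress only
    when the browser's Accept-Encoding header advertises gzip support;
    otherwise fall through to the uncompressed bytes."""
    if "gzip" in accept_encoding:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}
```

A real module also honours q-values and excludes configured content types; this only shows the safe fallback.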

Yes, you do incur a VERY minor performance hit due to compression, but the fact that the user gets the file quicker ( less to download ), and your server spends less time ( and bandwidth ) serving the file, makes this a VERY effective technique at saving you money and making your site more responsive.

How effective? On my site, it serves some large ( table-laden == lots of repetitive elements ) pages at around 12kb instead of 200kb.

You guys are trying to solve a problem that doesn’t exist. Have you ever heard ANYONE complain about the load times of CSS files? I haven’t, and I doubt I ever will. If you guys want to talk about saving bytes, find a way to get people to stop putting PDF files on the web. Compressing CSS is pretty pointless.

Rather than compressing and then decompressing your CSS file, it seems like a better solution would be to send your CSS through a build process that doesn’t replace the source, but creates a separate ready-to-upload compressed version. Then, when you need to revise your CSS, you don’t edit the compressed version that’s out there; you edit the source and then rebuild it before uploading. This eliminates the need for the decompression algorithm and the potential problem of slight variations each time you compress and then decompress your file.

I’m not sure I really see the value of the compressor these days - unless your CSS or HTML is enormous - at least tens, if not hundreds, of KBs, no one’s really going to notice any change in their experience of the site. I suppose that you might see some savings in bandwidth costs if you use up enough, but even then, would you trust your obviously complex code to an automated script? Maybe it’s just the size of sites I work on (few get more than a few thousand visitors a day, only one gets tens of thousands), but I’ve only ever come close to exceeding my allowed bandwidth on sites that are image-heavy, and the optimization that would have made a difference was all in the image size. Would a slightly smaller CSS/HTML page-size make an appreciable difference to the bandwidth this site uses, as I imagine it uses a fair amount?

And I counter with “wha?” right back, and now we’re stuck in a loop. (What *I* mean with that “wha?” is: care to clarify your confusion?)

“It seems like a better solution would be to send your CSS through a build process that doesn’t replace the source…”

Right, that’s what I mean by a server-side script that processes the file in advance. The file you upload doesn’t change, but when it’s sent out, it’s first processed and compressed. The file that sits on the server != the file the end user sees.

“Would a slightly smaller CSS/HTML page-size make an appreciable difference to the bandwidth this site uses, as I imagine it uses a fair amount?”

That’s really the question. I suspect not, but if a server-side parser can make it all happen automatically, and there’s at least *some* non-trivial percentage of bytes saved, it may be worth another look.

It would be nice if Dreamweaver and other such apps had an option to “compress/shrink” CSS, JavaScript and/or HTML upon upload. This would not affect the local files, just the ones being uploaded to the server.

What Nick probably meant is that stripping all that stuff out is compression. Being lossy doesn’t change that.

Also, you can gzip and strip the same file, and from my testing, if the stripped file was x% smaller, the gzipped stripped file is around x% smaller than the merely gzipped one; on some CSS files I’ve seen in the wild, this can amount to a few kilobytes.

That doesn’t sound all that much, unless you’re using a slow connection like GPRS.

However, I absolutely agree it makes sense to compress only the deployed CSS and not the original file.
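That combined saving is easy to measure rather than estimate. A quick harness (Python sketch with a deliberately naive whitespace stripper):

```python
import gzip
import re

def sizes(css: str) -> dict:
    """Report raw, stripped, gzipped, and stripped+gzipped byte counts,
    so the combined saving can be measured on a real sheet."""
    # naive strip: collapse whitespace, drop spaces around punctuation
    stripped = re.sub(r"\s*([{}:;,])\s*", r"\1", re.sub(r"\s+", " ", css)).strip()
    return {
        "raw": len(css.encode()),
        "stripped": len(stripped.encode()),
        "gzipped": len(gzip.compress(css.encode())),
        "stripped+gzipped": len(gzip.compress(stripped.encode())),
    }
```

Real-world numbers will vary with the file, which is exactly why measuring beats guessing.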

As GrumpySimon says, there’s no need to do clever things with the CSS - just turn on compression in your web server.

In effect, this makes the web server zip all files and send the browser the zipped file, which it in turn unzips before using. All this happens automatically, and it works today. Browsers that don’t support zipped transfers get the files plain. (Well, of course, NS4 supposedly gets it wrong, but hey, what does it _not_ get wrong?)

Although I haven’t done this in a while, I used to have a makefile to “build” and upload sites in a manner similar to what Jared and Stuart suggested. For each file that had been modified since the last upload, the source would be sent through local Perl scripts that would strip whitespace and comments from markup, JavaScript, and CSS. If needed, the scripts would handle static includes. Then, the makefile would execute the FTP transfer to the server. It was a simple matter to run a single make command after making a few changes and I still had my original source intact for future edits. Nevertheless, the number of bytes saved was negligible.

There’s one reason I haven’t heard: it makes your CSS less readable for people. So if you don’t like to share your design with the rest of the world, this might be a nice tool. Although it can be reversed, it’s too much trouble for most people.

Although server-side ‘shrinking’ isn’t of much use most of the time, I believe server-side processing is very useful.
Take CSS-SSC.

Be careful with Gzip and style sheets. It is a good solution, and I use it often. But it was recently pointed out to me that one dead browser [1], although it accepts the gzipped file, has one little problem with this: it never caches the file. If the visitor only reads/looks at one page, it is not a big problem, but when the same visitor goes on to read many pages on your site, the benefits rapidly disappear. At least for that one browser.
Source: http://lists.over.net/pipermail/mod_gzip/2002-December/006826.html

In fact, that piece of CSS code would be compressed into something like:

#header h2,#sidebar h3{font-size:2em}

…which is two characters shorter (I don’t even know how many bytes that is. 2?)

“If, however, you were to write a just-in-time server-side post-processor (and throw in a few more hyphenated words while you’re at it) that doesn’t break CSS caching, then we might have something. Consider what Shaun Inman did with CSS SSC — having a script parse my CSS file just as it’s about to get sent down the wire, so that the original CSS file I wrote doesn’t get compressed (and thus I can continue to edit it without having to do anything extra), seems like the only way this could work.”

Myself and a mate looked into writing a module for Apache about four years ago, to parse content before it was sent into mod_gzip (if it was enabled).

Before we wrote a line of code, we put in some manual legwork to check out the possible benefits. What we found was that it just wasn’t worth the effort. At most, you’d be shaving a couple of KB off an average CSS file, which is more than compensated for by mod_gzip.

Personally, I don’t like the way they’ve implemented this, because as the poster points out, there isn’t anything necessarily wrong with your code. In fact, it encourages you to add redundant code to your CSS — not a good thing in my opinion.

On the other hand, the validator doesn’t know that, somewhere in your HTML, you haven’t combined these classes…

I suppose a few bytes can be saved by cleaning the style sheet when the server or client does not support gzip compression. However, in light of the fact that style sheets are generally cached for subsequent page visits, I think it would pay off better to clean the HTML, using measures similar to those Dave proposes, such as removing unnecessary white space, tabs, new lines, etc.

I’d like to think I write some pretty damn good CSS, already optimised as much as humanly possible. The only benefit I could see would be to remove whitespace and my comments (which aren’t huge anyway, usually just small comments to make it easily readable).

Given that, I did a quick calculation and found that I’d get about a 3% gain in terms of file size, nowhere near enough to warrant an optimisation tool. Then you have the whole problem of the optimiser removing comments that are used as a hack, like:

/* skip mac ie \*/
* html #foo {
margin: 0;
}
/* end skip */

Server side compression is where it’s at, but even then I can hardly see the benefit for something that’s getting cached anyway…
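Worth noting for anyone writing such a tool: the backslash in `/* skip mac ie \*/` is load-bearing (IE5/Mac treats the escaped `*/` as still inside the comment), so a stripper has to keep both halves of the hack. A guarded version might look like this (Python sketch):

```python
import re

def strip_comments(css: str) -> str:
    """Remove /* ... */ comments, but preserve any comment ending in a
    backslash (the IE5/Mac 'comment backslash' hack) together with the
    closing comment that follows it, so the hack keeps working."""
    out, keep_next = [], False
    for token in re.split(r"(/\*.*?\*/)", css, flags=re.S):
        if token.startswith("/*"):
            if keep_next or token.endswith("\\*/"):
                out.append(token)                # part of a hack: keep it
            keep_next = token.endswith("\\*/")   # keep the hack's closer too
        else:
            out.append(token)
    return "".join(out)
```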

Is the time you spend on the compression solution worth less than the bandwidth costs saved?

Hahaha… yeah, to clarify my “wha”… a “shrinking of what gets sent over the wire” is exactly *compression*. I guess you feel that “true” compression requires some fancy algorithm, but it doesn’t really. Your “whitespace removal” compression is a perfectly valid lossy compression. There are too many “quoted phrases” in this post.

I wrote a white paper on CSS aliases a while ago. Just my five cents.

CSS aliasing concept (Proposal)
Author: Daniel Duris

The point of aliasing in CSS is to save redundant strings
that are used more than once in a CSS file. Nowadays CSS files are sometimes
too long, which makes them incomprehensible. The same elements are also often
used more than once (e.g. nested links) in CSS files.
There is a need for some shorthand for this: aliasing.

The concept of aliasing is not new and is well known in many fields, e.g.
in databases and SQL. By using aliases, the code you write
is more comprehensible and becomes significantly shorter.

CSS aliasing concept introduces two new characters into CSS:

& means alias: it serves as a pointer to the element that comes after the pointer’s name
$ means aliased element: it stands in for the aliased element (its functionality does not
differ from today’s use of elements)

In the case demonstrated above, the CSS file is about 90 characters shorter compared
to one using today’s CSS capabilities. For longer style sheets, this method can save
a significant number of bytes and amount of bandwidth, even if the style sheet is
transferred only once per web site.

Special programs could create such optimized/aliased CSS files automatically in future.
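The proposal’s worked example doesn’t appear above, but to illustrate how such a preprocessor might expand aliases back into plain CSS, here is a sketch (Python; the exact `&name selector` / `$name` line syntax is my assumption, not the white paper’s):

```python
import re

def expand_aliases(css: str) -> str:
    """Expand '&name long selector' definition lines and '$name'
    references into plain CSS, as an aliasing preprocessor might."""
    aliases: dict[str, str] = {}
    out = []
    for line in css.splitlines():
        m = re.match(r"&(\w+)\s+(.+)", line.strip())
        if m:
            aliases[m.group(1)] = m.group(2)   # definition line: emit nothing
            continue
        out.append(re.sub(r"\$(\w+)", lambda ref: aliases[ref.group(1)], line))
    return "\n".join(out)
```

Run over a sheet where `&side #sidebar div.module` is defined once, every `$side h3`, `$side p`, and so on expands to the full selector.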

Due to the hugely complex mathematical analysis required for a CSS file larger than about 10 statements, I think your best bet for CSS optimisation is genetic algorithms. Pass it through a fantastically created fitness function and I bet a GA could produce a CSS file smaller than any CSS file even an expert human could come up with.

It seems to me that the value of stripping white-space and comments is simply not worth the trouble.

However, Doug, as you pointed out, taking advantage of the cascade can potentially result in a pretty big saving.

How about this for a program algorithm:
1. Load a CSS file
2. Apply that CSS to a set of HTML files from your site.
3. [[Somehow]] generate a new CSS file that will result in the same styles being applied to all elements in the HTML files, with the smallest possible number of rules.

The hard part, of course, is the “Somehow.” I simply have no idea at this point how that would be done. And, of course, browser-specific hacks like * html and html > body will be lost.

Also, there’s simply no way that this could be done fast enough for a JIT compile. It’d have to be done before-hand. However, you could write a CSS file, and then tell the optimizer to spider your site, and end up with a tidy file that could be uploaded and referenced by all those pages. This would programmatically reproduce the same process that a human goes through in constructing a CSS file, but without the “gruntwork” of optimizing and removing unnecessary rules.

@5 and others: I think it really would need to be done beforehand. Any idea involving Apache doing this on the fly is a waste of processing time for a file which is basically static. So in the “right” model, the server would store the exact same version the user sees. It simply won’t store the version the developer created.

Then again if you really want to save time and space, store the gzipped version as well so that Apache doesn’t have to gzip it on the fly. And after all that, I’m sure you would save a lot more bandwidth by gzipping everything else, than gzipping this file which is loaded once per visit to the entire site.

@27: Obviously it would have to be pretty smart to use rules like that. In the example you gave, a “div” element between the “body” and “p” with “color:white” would instantly break that optimisation, leaving it to be only useful for much more constrained rules like “body > p”.

@40: That would be an awesome addition to CSS, though I wouldn’t use it for that purpose… I would use it so that I only have to define each colour once.

How many times have you created a CSS file, and had the client say “it’s okay… but this blue needs to be slightly less blue.” Well, time to go off and change several different colours, but not _all_ occurrences of it, because that might change some things you don’t want to.

It would be much better if the top of the CSS file defined a theme, and the rest of the file referenced it.
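Lacking support in CSS itself, that theme idea works today as a build step: define the palette once and substitute it into the sheet before uploading. A sketch (Python; the `$name` placeholder syntax is invented for illustration):

```python
import re

def apply_theme(css: str, theme: dict) -> str:
    """Replace $name placeholders with theme values, so a colour is
    defined once in the theme dict and referenced everywhere.
    Raises KeyError on a placeholder the theme doesn't define."""
    return re.sub(r"\$([\w-]+)", lambda m: theme[m.group(1)], css)
```

When the client wants the blue slightly less blue, you change one dictionary entry and rebuild.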

I would say that it is a worthwhile build methodology to routinely put your CSS through an intelligent and powerful CSS compressor. Why? In my view, the best reason for doing so is that it allows you to write your CSS in any way that suits you: in a style where maintenance-friendliness and readability are the only criteria, without feeling pressured into unreadable or hard-to-maintain code by the demands of byte-skimping, and without needing to feel guilty about the weight and cost of the end result if you do adopt more costly “best practice” techniques.

Many web designers seem not yet to have realised the need to decouple “source files” from “object files” in the design and build process, although the concepts of “source” and “object” code are so familiar to software engineers that they nowadays go without saying. This was astonishing to me when I first looked at CSS and (X)HTML, but I suppose that I should hardly have been so surprised, given that after all many web designers don’t come from a software engineering background.

However, those designers who are still living within the “source file = executable/object file” model are in a sense (if you’ll allow me a software analogy) still stuck in the era of interpreted programming languages, where the source code itself is what gets delivered to the user.

I separate my sources out into sections which are “replaceable”, such as colors (for example), typography and text sizes, layout and so on. This allows these modules to be more easily swapped out later on.

Molly Holzschlag’s often-quoted “surgical correction strategy” article is another attempt at suggesting a process improvement, one which in my view is very misguided in its implementation because of the “source = object” error I mentioned earlier. But that’s not the point: Molly is thinking about the problem of managing code properly, and her solutions, like the source-code techniques discussed in the Stopdesign articles, *cost*. That huge cost only exists if you are in the “source = object” model.