Best Web Page Structure Convention?

What's the best Web page structure convention to use when creating and publishing professional Web pages?

I've created three HTML files on my PC with different formatting structures and had a look at the file sizes, and I figured it's a pretty important thing to take into consideration when creating Web pages. So, care to vote please? And also, give me your views, opinions and your knowledge too please. Don't forget your brains, I want them too hehe.

Compressing the contents of the HTML page before sending it to the requesting agents makes whitespace a non-issue. All my tabs, for example, are four consecutive spaces. When you compress that with gzip, those repeated runs of spaces shrink to next to nothing across the whole document.

If you compress the HTML it makes the source difficult to read, which I think is a bit anti-social.

I don't mean the kind of compressing where you put everything on a single line. I mean using something like gzip or deflate, where the page is compressed before it is sent down the pipeline and then uncompressed to its original form on the client's end. In other words, the formatting of the HTML document is preserved.
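To put rough numbers on that claim, here is a minimal sketch using Python's standard `gzip` module. The `build_page` helper and the toy list markup are made up for illustration; real pages will compress differently, so treat the sizes as indicative only.

```python
import gzip


def build_page(indent: str, rows: int = 200) -> str:
    """Build a toy HTML page (hypothetical markup) with `indent` repeated per nesting level."""
    lines = ["<html>", f"{indent}<body>", f"{indent * 2}<ul>"]
    for i in range(rows):
        lines.append(f"{indent * 3}<li>Item {i}</li>")
    lines += [f"{indent * 2}</ul>", f"{indent}</body>", "</html>"]
    return "\n".join(lines)


# Same page twice: once with four-space indentation, once with none.
indented = build_page("    ").encode("utf-8")
flat = build_page("").encode("utf-8")

indented_gz = gzip.compress(indented)
flat_gz = gzip.compress(flat)

print("raw bytes: ", len(indented), "indented vs", len(flat), "flat")
print("gzip bytes:", len(indented_gz), "indented vs", len(flat_gz), "flat")

# Decompressing gives back the original bytes, indentation included.
assert gzip.decompress(indented_gz) == indented
```

The gap between the two gzip sizes should come out far smaller than the gap between the raw sizes, and the final assertion shows the indentation survives the round trip.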

Hmmm. I don't get what you mean by compressing the HTML document. I know how to compress a file or a folder and ZIP it, yes. But...If you compressed a .html file and uploaded it via your file manager or FTP then would it still open it as an ordinary HTML Web page rather than asking you to save the .html.zip or something?

See, I want to try and make up a nice Web page structure convention that costs me as little file size as possible while keeping the source easily readable and understandable, with the necessary indents to show nesting.

What you need is someone using one of those 28.8 kbps modems to answer your poll. While indeed a very long document with whitespace will have that add up, I still don't see it getting larger than an average image in size.

I write in style like two.html, because I need my HTML readable even after it's online. If you can do compression on a server copy while keeping the human-readable version on your development machine then it can be a non-issue (unless you need someone to look at an online site and figure out a problem).

...If you compressed a .html file and uploaded it via your file manager or FTP then would it still open it as an ordinary HTML Web page rather than asking you to save the .html.zip or something?

That is just it, you don't upload a compressed version! You upload the original and raw version. The web server (Apache or IIS, etc) will actually be the one responsible for handling the compression.

It works like this: an agent requests a page from your server, and along with the request it tells the server that the client supports gzip. The web server then pulls the requested page, and just before sending it down the pipeline it compresses it using gzip. Once the client receives the response, it uncompresses the page automatically. It is fully transparent.
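The exchange described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Apache or IIS implement it: `respond` and `client_read` are hypothetical names, and the header parsing is simplified (it ignores quality values such as `gzip;q=0`).

```python
import gzip
from typing import Dict, Tuple


def respond(body: bytes, accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Server side: gzip the body only if the client advertised gzip support."""
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body  # client did not ask for gzip, so send the page as-is


def client_read(headers: Dict[str, str], payload: bytes) -> bytes:
    """Client side: undo the compression when the response says it was applied."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


page = b"<html>\n    <body>\n        <p>Hello</p>\n    </body>\n</html>"

headers, payload = respond(page, "gzip, deflate")
assert client_read(headers, payload) == page  # same bytes, whitespace intact
```

The file on the server stays uncompressed the whole time; only the bytes on the wire change.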


What you need is someone using one of those 28.8 kbps modems to answer your poll. While indeed a very long document with whitespace will have that add up, I still don't see it getting larger than an average image in size.

Heh. I didn't think it was a big deal myself, because it's not like all the tab space and line breaks are going to add a whole 1 or 2KB.

I write in style like two.html, because I need my HTML readable even after it's online.

I like that too. I want it to be a fast-loading Web page, but if anyone wants to view the source, I want it to be easily readable and indented properly where nesting occurs.

logic_earth, if you are talking about the following, from the Yahoo! Developer page on the performance rules for gzip:

Apache 1.3 uses mod_gzip while Apache 2.x uses mod_deflate.

Then I think I understand what you're talking about. That's a pretty damned nifty feature. Yet again, Apache rocks my socks! Haha.

I don't know whether I should be calling these conventions or standards now, though. Is there an actual Web page structure standard out there already that the majority uses, or is it a mixture on top of a base convention?

Assuming some degree of sanity within your scheme, the bandwidth issue is trivial. More important is human readability.

My own preference looks more like your last, the gzip/gunzip version. I also configure my editor to convert tabs to spaces (I use two spaces rather than four) on save, which makes it interoperable with differently configured and lesser (i.e. not Emacs) editors. Nested elements are indented, and sibling elements are separated by a newline. I also put element attributes one to a line.
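For anyone whose editor can't do the tab conversion on save, the same idea is easy to sketch in Python. `tabs_to_spaces` is a hypothetical helper, and it assumes each leading tab stands for exactly one nesting level; tabs elsewhere in a line are left alone.

```python
def tabs_to_spaces(text: str, width: int = 2) -> str:
    """Replace each leading tab with `width` spaces, leaving the rest of the line untouched."""
    converted = []
    for line in text.split("\n"):
        stripped = line.lstrip("\t")
        depth = len(line) - len(stripped)  # number of leading tabs on this line
        converted.append(" " * (width * depth) + stripped)
    return "\n".join(converted)


source = "<ul>\n\t<li>\n\t\t<a>x</a>\n\t</li>\n</ul>"
print(tabs_to_spaces(source))
```

Because the width is a parameter, the two-versus-four debate comes down to one argument rather than a rewrite of the file.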

I don't mean the kind of compressing where you put everything on a single line. I mean using something like gzip or deflate, where the page is compressed before it is sent down the pipeline and then uncompressed to its original form on the client's end. In other words, the formatting of the HTML document is preserved.

I should have worked that one out by myself!

Andrew, I suppose there is a convention with indentations in that the child is indented with respect to the parent, but the number of spaces/tabs is a matter of, perhaps rather pathetically, sometimes quite hot debate. I don't think there's a convention regarding blank lines between elements, but I find that rather irritating unless it's between largish collections of elements. As long as it's legible and well-structured without lots of horrible junk (like sudden huge swathes of embedded JavaScript), that's what matters.

People think their opinion is the best and want to impose it on others. This normally goes hand in hand when people are discussing which text editor is the best.

I would still go with two.html, just because I don't think the indentation of BODY and HEAD are necessary and it's good to try to keep things from getting ridiculously indented (to the point where horizontal scrolling - aaagh! - is necessary). This is also why I prefer two spaces to four for indentations, in any language.

People think their opinion is the best and want to impose it on others. This normally goes hand in hand when people are discussing which text editor is the best.

So this isn't a big deal with readability then? I mean other Web Devs who check the source out on their system / in their text editor aren't going to freak out over two spaces are they?

I would still go with two.html, just because I don't think the indentation of BODY and HEAD are necessary and it's good to try to keep things from getting ridiculously indented (to the point where horizontal scrolling - aaagh! - is necessary).

I agree. I don't want crazy indentation all over the place to the point where horizontal scrolling occurs. I hate that. However, for consistency across the source, should we not indent the <head></head> and <body></body> as well?

I really feel there should be some sort of Web page structure standard or something along those lines, for the sake of consistency. Anyone's thoughts on this?

This is also why I prefer two spaces to four for indentations, in any language.

I do as well, except I'm finding that in Javascript, I can't see two spaces well enough. When there's an if statement starting, I've found I need or 4 spaces for the "stuff happening" inside to be obviously indented compared to the word "if". I dunno why. But I also don't use a tab for spacing. Mostly because in gEdit, if I change the tab spacing, all my old 8-space tabs (in older pages, where the child really is 8 spots indented) get turned to 2 spaces as well, which screws up nesting.

So this isn't a big deal with readability then? I mean other Web Devs who check the source out on their system / in their text editor aren't going to freak out over two spaces are they?

It's actually a religious thing. You know how that goes.

Many web devs can take someone else's code and have it indented the way they like. I see Paul O'B uses Dreamweaver to do this for him when he's working on others' code.

I agree. I don't want crazy indentation all over the place to the point where horizontal scrolling occurs. I hate that. However, for consistency across the source, should we not indent the <head></head> and <body></body> as well?

Well, if your personal rule is two spaces per child level, then yeah, you should, but I don't either. I think of the body and html tags as thuper-thpecial, and they fall outside the rules for me.