Beautiful HTML is the foundation of a beautiful website. When I teach people about CSS1, I always begin by telling them that good CSS can only exist with equally good HTML markup. A house is only as strong as its foundation, right? The advantages of clean, semantic HTML are many, yet so many websites suffer from poorly written markup.

Let’s take a look at some poorly written HTML, discuss its problems, and then whip it into shape! Bear in mind, we are not passing any judgment on the content or design of this page, only the markup that builds it. If you are interested, take a peek at the bad code2 and the good code3 before we start so you can see the big picture. Now let’s start right at the top. [Content Care Oct/31/2016]

If we are going to do this, let’s just do it right. No need for a discussion about whether to use HTML 4.01 or XHTML 1.0: both of them offer a strict version that will keep us nice and honest as we write our code.

Our code doesn’t use any tables for layout anyway (nice!), so there really is no need for a transitional DOCTYPE.

In our <head> section, the very first thing should be the declaration of our character set. We’re using UTF-8 here, which is swell, but it’s listed after our <title>. Let’s go ahead and move it up so that the browser knows what character set it’s dealing with before it starts reading any content at all.

While we’re talking about characters, let’s go ahead and make sure any funny characters we are using are properly encoded. We have an ampersand in our title. To avoid any possible misinterpretation of that, we’ll convert it to &amp; instead.

All right, we are about three lines in and I’m already annoyed by the lack of indentation. Indentation has no bearing on how the page is rendered, but it has a huge effect on the readability of the code. Standard procedure is to indent one tab (or a few spaces) when you are starting a new element that is a child element of the tag above it. Then move back in a tab when you are closing that element.

Indentation rules are far from set in stone; feel free to invent your own system. But I recommend being consistent with whatever you choose. Nicely indented markup goes a long way in beautifying your code and making it easy to read and jump around in.

We have some CSS that has snuck into our <head> section. This is a grievous foul because not only does it muddy our markup but it can only apply to this single HTML page. Keeping your CSS files separate means that future pages can link to them and use the same code, so changing the design on multiple pages becomes easy.

This may have happened as a “quick fix” at some point, which is understandable and happens to all of us, but let’s get that moved to a more appropriate place in an external file. There is no JavaScript in our head section, but the same goes for that.

The title of our site, “My Cat Site,” is properly inside <h1> tags, which makes perfect sense. And as per the norm, the title is a link to the home page. However, the anchor link, <a>, wraps the header tags. Easy mistake. Most browsers will handle it fine, but technically it’s a no-no. An anchor link is an “inline” element, and header tags are “block” elements. Blocks shouldn’t go inside inline elements. It’s like crossing the streams in Ghostbusters. It’s just not a good idea.

I don’t know who first coined it, but I love the term “divitis,” which refers to the overuse of divs in HTML markup. Sometime during the learning stages of Web design, people learn how to wrap elements in a div so that they can target them with CSS and apply styling. This leads to a proliferation of the div element and to their being used far too liberally in unnecessary places.

In our example, we have a div (“topNav”) that wraps an unordered list (“bigBarNavigation”). Both divs and unordered lists are block-level elements. There really is no need whatsoever for the <ul> to be wrapped in a div; anything you can do with that div you can do with the <ul>.

This is a good time to bring up naming conventions, as we have just talked about that unordered list with an id of “bigBarNavigation.” The “Navigation” part makes sense, because it’s describing the content that the list contains, but “big” and “Bar” describe the design not the content. It might make sense right now because the menu is a big bar, but if you change the design of the website (and, say, change the website navigation to a vertical style), that id name will be confusing and irrelevant.

Good class and id names are things like “mainNav,” “subNav,” “sidebar,” “footer,” “metaData,” things that describe the content they contain. Bad class and id names are things that describe the design, like “bigBoldHeader,” “leftSidebar,” and “roundedBox.”

The design of our menu calls for all-caps text. We just dig how that looks, and more power to us. We have achieved this by putting the text in all-caps, which of course works; but for better, future extensibility, we should abstract typographic choices like this one to the CSS. We can easily target this text and turn it to all-caps with {text-transform: uppercase}. This means that down the road, if that all-caps look loses its charm, we can make one little change in the CSS to change it to regular lowercase text.

By “class the body,” I literally mean apply a class to the body, like <body class=”blogLayout”>. Why? I can see two things going on in this code that lead me to believe that this page has a layout that is different than other pages on the website. One, the main content div is id’d “mainContent-500.” Seems like someone had a “mainContent” div at one point and then needed to alter its size later on and, to do so, created a brand new id. I’m guessing it was to make it larger, because further in the code we see <class=”narrowSidebar”> applied to the sidebar, and we can infer that that was added to slim down the sidebar from its normal size.

This isn’t a very good long-term solution for alternate layouts. Instead, we should apply a class name to the body as suggested above. That will allow us to target both the “mainContent” and “sidebar” divs uniquely without the need for fancy new names or class additions.

Having unique class and id names for the body is very powerful and has many more uses than just for alternate layouts. Because every bit of page content lies in the “body” tag, you can uniquely target anything on any page with that hook; particularly useful for things like navigation and indicating current navigation, since you’ll know exactly what page you are on with that unique body class.

Kind of goes without saying, but you should run your code through the ol’ validator machine to pick up small mistakes. Sometimes the mistakes will have no bearing on how the page renders, but some mistakes certainly will. Validated code is certain to outlive non-validated code.

If for no other reason, seeing that green text on the W3C validator tool just makes you feel good inside.

If it is at all possible, keeping the sections of your website in a logical order is best. Notice how the footer section lies above the sidebar in our code. This may be because it works best for the design of the website to keep that information just after the main content and out of the sidebar. Understandable, but if there is any way to get that footer markup down to be the last thing on the page and then use some kind of layout or positioning technique to visually put it where you need it, that is better.

We’ve covered a lot here, and this is a great start to writing cleaner HTML, but there is so much more. When starting from scratch, all of this seems much easier. When trying to fix existing code, it feels a lot more difficult. You can get bogged down in a CMS that is forcing bad markup on you. Or, there could be just so many pages on your website that it’s hard to think of even where to begin. That’s OK! The important thing is that you learn how to write good HTML and that you stick with it.

The next time you write HTML, be it a small chunk in a huge website, or the beginning of a fresh new project, just do what you can to make it right.

Rafyta

kocsmy

Parker

I’ve given up on the Strict DOCTYPE. I used to be its biggest advocate, but it has become less and less viable to me. The reason: user-generated content. We’re well into Web 2.0 now, whatever that means. It’s the era of interaction. I can control my code and validate it and make it semantically perfect… but the minute a user types an ampersand into a comment box, the whole thing explodes. Not to mention mismatched, unclosed, or deprecated HTML tags.

I suppose if I got paid more I could bother to parse everything and convert characters to entities and figure out what to do with user-supplied HTML. That’s too big of a job to be feasible for me — I can’t anticipate every possible thing a user might type. With the Strict DOCTYPE, all it takes is one syntax error and the entire page fails most ungracefully. I can’t afford to let that happen on a modern dynamic site. Better to go with a Transitional DOCTYPE and let quirks mode deal with irregularities. Practicality wins out.

Andrea

Raanan Avidor

Close, but no cigar.
It is a right step in the right direction, but… (there is always a but)…
1) Since there should be only one <h1> in a page it does not need an id and if you need to style the <h1> in a different way in different pages you can use the body id to identify the <h1> with body#blogLayout h1{color:#f00}.
2) in most designs the <div id="topNav"> is needless and the <ul> under it can be the hook for the CSS, on the other hand, if you do need the <div id="topNav"> then you probably do not need the id in <ul id="bigBarNavigation">
3) the name “topNav”, “bigBarNavigation”, “sidebar” and “footer” are problematic. The id of an element should describe its content not its position (top, side, foot) or style (big). Navigation is a good name, you can add mainNavigation, secondaryNavigation and copy as ids, now if you will change (with CSS) the position and/or style of the element the id will still describe its content.

Albert

John Faulds

With the Strict DOCTYPE, all it takes is one syntax error and the entire page fails most ungracefully.

Only if you’re using an XHTML strict doctype and serving your page up as proper XHTML (mime type = application/xml+xhtml) with its more draconian error handling. And if you’re doing that then you’d need to be using content negotiation for IE anyway. XHTML served as text/html won’t fail badly and neither will HTML 4.0 pages. The pages might not validate, but they won’t fall apart. Sounds like you’re making things more difficult for yourself than they need to be.

Since there should be only one <h1> in a page it does not need an id

Not necessarily. You might use a h1 for the title of your site on the site’s home page but on internal pages use the h1 as the page title with the site’s title then being wrapped in a different tag. Rather than have two style rules for each different element, if you give the site title an ID, you only need to write the rule once. It also means that on internal pages that if you use a <p> or <div> for the site title that it won’t get confused with other similar elements in the header.

0

18

ashorlivs

James

Derek McDonald

I love clean code and whenever I develop anything I always make sure the code is readable and well commented. I find that it helps to have clean code to easily fix problems that may occur when the code is rendered.

Great list of helpful tips there but don’t forget people that both javascript and css also need to be nice and clean too.

Today, too many websites are still inaccessible. In our new book Inclusive Design Patterns, we explore how to craft flexible front-end design patterns and make future-proof and accessible interfaces without extra effort. Hardcover, 312 pages. Get the book →

Meet the new Sketch Handbook, our brand new Smashing book that will help you master all the tricky, advanced facets of Sketch. Filled with practical examples and tutorials in 12 chapters, the book will help you become more proficient in your work. Get the book.

Meet SmashingConf San Francisco 2017, featuring front-end ingredients, UX recipes and nothing but practical beats from the hidden corners of the web. Only practical, real-life techniques and recipes you can learn from. Get your ticket now!