WTF Are Canonical URLs?

A canonical URL is the one exact version of a page’s URL that you pick as the authoritative one, so that other, slightly different versions can be treated as equivalent to it.

God, how confusing. I learned about URLs a long time ago. A URL is a “Uniform Resource Locator,” a string of characters that points to the place on the web where a webpage or other thing lives (the domain name part of it gets translated to an IP address along the way). I got that. So now you’re telling me there are different kinds of URLs?

This one is actually not difficult. Just imagine you’re ‘canonizing’ one version of your URL as the holy one… the one you’d want written in the Great Book of All Websites, exactly as you’d want it spelled.

Canonical URLs come straight from Google-land, and the endeavor of indexing this sprawling mass of web content we have. It’s another one of those many web technology things that’s actually less about how computers work and more about how humans organize their communication systems. More Dewey Decimal System than Turing Test.

Imagine you’re a search engine, and painfully literal-minded like any software. How do you know that joescoffee.com is the same as www.joescoffee.com and joescoffee.com/index.html? We know those are all the same webpage, and that those differences are ironed out automatically by the magic of the Internet. But when developing a website, there are some cases where you’ll want or need to specify the exact version that should get used, for example because Google needs to know which one to index.
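
To see the problem the way literal-minded software does, here is a minimal Python sketch. The `normalize` function and its rules (assume https, drop the www, drop a trailing index.html) are my own assumptions for illustration, not how any real search engine works. Compared as raw strings, the three addresses are three different things; only after someone decides on the rules do they collapse into one.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Collapse common variants of a URL into one form.

    The rules here are hypothetical choices made for this example.
    """
    # Add a scheme if it is missing so urlsplit can find the host.
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)

    # Treat the www and non-www hosts as the same site.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Treat /index.html as the same page as the bare directory.
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    if path == "":
        path = "/"

    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "joescoffee.com",
    "www.joescoffee.com",
    "joescoffee.com/index.html",
]

# As plain strings, these are three distinct values.
print(len(set(variants)))

# After normalizing, they all map to a single URL.
print({normalize(v) for v in variants})
```

The point is that the rules are a human decision. Nothing in the strings themselves says which spelling should win.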

And that’s when the canonical URL comes in. Want to use a www everywhere? Are you keeping it old-school with index.html? Or do you like to keep things streamlined? It’s up to you.
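
In practice, the usual way to declare your pick is a link tag with rel="canonical" in the head of the page. Here is a small Python sketch that builds that tag. The helper name `canonical_link_tag` and the www-everywhere choice in the example are assumptions for illustration; swap in whichever version you have canonized.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Build the <link> tag that names a page's canonical URL.

    Hypothetical helper; the URL is escaped so it is safe inside
    a double-quoted HTML attribute.
    """
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# Example: this site has decided to use the www version everywhere.
print(canonical_link_tag("https://www.joescoffee.com/"))
```

Every variant of the page would carry the same tag, so whichever address a crawler arrives at, it is pointed to the one you chose.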

Forwardslash is a blog about creative technology, programming, the Internet, and digital aesthetics. It's a developer's notebook meets general-interest blog on technology's overlap with other topics. It's written by Justin Allen, a guy who works on websites for a living and has been writing stories for even longer than he's been writing HTML.