Thursday, February 13, 2014

As some of my vast readership knows, I moderate over at Sadly, No!. About once a week somebody has trouble with linking or otherwise bolding/italicizing comments, so I wrote this up as a comment over there. It was long enough to deserve a post of its own (slightly tweaked for context), for posterity, so here you go:

Here's what all that angle bracket shit you're using for linking/bold/italic/etc. means.

This is about HTML. Now I'm sure most of you have heard those letters before, and many will know what they mean. This isn't for you. Stop reading.

Really. Stop. Fine, it's your brain.

HTML stands for "HyperText Markup Language" which itself used SGML, the "Standard Generalized Markup Language" to define its concrete syntax.

Basically what that means is that's where the angle brackets (<>) that we use to mark the text for special consideration come from. Beyond that SGML is mostly just historical at this point and you should read the wackypedia if you care.

The markup that FYWP (and Google/blogspot, also, too, mostly) allows us to use in comments is a very small subset of HTML -- quoting, italics, bold, strikeout, etc. All of the markup we use has its corresponding tag. A "tag" is what marks the beginning and ending of "special" text -- stuff we want to make bold or turn into a link. Tags start with a left angle bracket (<) a.k.a. "less than" and are immediately followed by one or more characters. In the case of the link tag, it's "a" as in "anchor." That's what it was called back in the day as its main use was to "anchor" one part of a document to another so you could jump back-n-forth glossary/index style. Nobody gives a shit about that anymore.

After that opening bit some tags can be immediately closed with a closing right angle bracket (>) a.k.a. "greater than" (e.g. <i>) -- other tags (e.g. <a ) should be followed by one or more attributes. An attribute is just that, more stuff that describes the tag, and so affects the human-readable text inside the tag's opening and closing markers.

Links can have a few different attributes, but the one we really care about is "href."

The "href" attribute specifies the "hyperlink reference" that the anchor points to -- just a fancy name for what we know as a link, a.k.a. "URL" -- "uniform resource locator" (or, more correctly these days, "URI" -- "uniform resource identifier").

So to put that all together, we add a space after our "<a" then add our attribute and its contents. The attribute name must be followed by an equals sign (=), that's the separator that tells the parser that this attribute contains a value and isn't just an on/off flag, and that should be followed by a quote character (single or double, doesn't matter) to delineate the actual start of the attribute's contents, the URL.

A URL can be a very complicated thing, so I'll just describe the most basic use, pointing to a simple file/resource on a web server. Such a link needs to start with "http://" because that tells us we want the "HyperText Transfer Protocol" and not something else, like FTP ("File Transfer Protocol") or something. There are other protocols, but you don't care about those (or any of this, really. Why are you still reading?) After the dual slashes (again used as delimiters and also required -- just one doesn't cut it) we have the name of the internet server (domain name) e.g. "www.google.com". Yes, of course you could use the server's IP address instead of its domain name, why couldn't you? Do you really want me to get into IPv4 notation too? IPv6? You monster!

Ahem.

The name (yes, or address) is followed by another delimiting slash. And then you get to the actual stuff you want to look at on that particular server -- the path that points to the file and/or actual resource you're looking for (e.g. "/the/internet/is/a/silly/place.html"). There are lots and lots of things that can be contained in that string, but since most of you wankers are just copying and pasting them from somewhere, you don't care.
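If you'd rather watch a computer chop a URL into those pieces than take my word for it, here's a minimal sketch using Python's standard urllib.parse module. The server name "a.server.name" is made up for illustration, and the variable names "parts" and "oops" are just mine. It only looks at the three pieces described above: protocol, server, path.

```python
from urllib.parse import urlsplit

# A made-up URL: protocol, two slashes, server name, then the path.
parts = urlsplit("http://a.server.name/the/internet/is/a/silly/place.html")
print(parts.scheme)  # http
print(parts.netloc)  # a.server.name
print(parts.path)    # /the/internet/is/a/silly/place.html

# Same thing with only ONE slash after the colon. No server name is found,
# and the whole rest of the string gets treated as a path.
oops = urlsplit("http:/a.server.name/the/internet/is/a/silly/place.html")
print(repr(oops.netloc))  # ''
print(oops.path)          # /a.server.name/the/internet/is/a/silly/place.html
```

The second half is the "just one doesn't cut it" bit in action, at least as far as this particular parser is concerned. Browsers may be more forgiving, but don't count on it.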

Anyhoo, the URL is then followed by the closing quote character -- of course it has to match the opening character -- symmetry! -- and the closing right angle bracket >

All that just gets us our opening tag. See, it's easy when you don't think about it!

To recap, what we have so far:

<a href="http://a.server.name/the/internet/is/a/silly/place.html">
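For the terminally curious, here's roughly how a parser sees that opening tag. This is a sketch using Python's built-in html.parser module, not whatever FYWP or your browser actually runs. The class name "TagDumper" and the helper "opening_tags" are names I made up for the example.

```python
from html.parser import HTMLParser


class TagDumper(HTMLParser):
    """Collects (tag name, attributes) for every opening tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as (name, value) pairs with the quotes already stripped.
        self.seen.append((tag, dict(attrs)))


def opening_tags(markup):
    dumper = TagDumper()
    dumper.feed(markup)
    dumper.close()
    return dumper.seen


double = opening_tags('<a href="http://a.server.name/the/internet/is/a/silly/place.html">')
single = opening_tags("<a href='http://a.server.name/the/internet/is/a/silly/place.html'>")

print(double[0][0])          # a
print(double[0][1]["href"])  # http://a.server.name/the/internet/is/a/silly/place.html
print(single == double)      # True
```

Single quotes or double quotes, the parser hands back the same tag name and the same "href" value. It only cares that the closing quote matches the opening one.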

After the opening tag, you put whatever you want to actually be displayed to be clicked on, typically:

PENIS

That's the "contents" of the tag. The human-readable shit that we're meant to see. Get it? We have like, these magic pairs of less-than/greater-than doohickeys with things in them that tell the computer what to do, and we put human readable shit in-between pairs of these other pairs. We good so far? No? Tough, you shouldn't even be reading this anyway.

Now... wait for it... we need to close the tag. Guess what that is? Yep more brackets and slashes and shit. Closing tags don't get attributes, so at least we have that going for us. In our example, it's simple -- we need to start the tag with our left angle bracket (<), then tell the parser that this is going to be a closing tag by putting yet another slash (/) in there. Then that needs to be followed by the corresponding type of tag we're closing -- it's an "a" for "anchor" remember? -- and then we close the closing tag (yes, what of it?) with that final, delightful, right angle bracket (>). That combination should look like this if you've forgotten already: </a>
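Stick the opening tag, the contents, and the closing tag together and you've got a whole link. Here's a sketch of what a parser makes of the assembled thing, again using Python's standard html.parser module as a stand-in for the real deal. "LinkWalker" and "walk" are names invented for this example, and the URL is the made-up one from the recap.

```python
from html.parser import HTMLParser


class LinkWalker(HTMLParser):
    """Records each piece of markup in the order the parser meets it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag, dict(attrs)))

    def handle_data(self, data):
        # The human-readable contents between the opening and closing tags.
        self.events.append(("text", data))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


def walk(markup):
    walker = LinkWalker()
    walker.feed(markup)
    walker.close()
    return walker.events


link = '<a href="http://a.server.name/the/internet/is/a/silly/place.html">PENIS</a>'
for event in walk(link):
    print(event)
# ('open', 'a', {'href': 'http://a.server.name/the/internet/is/a/silly/place.html'})
# ('text', 'PENIS')
# ('close', 'a')
```

Three pieces, in order: the opening tag with its attribute, the stuff humans see, and the closing tag with no attributes at all.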