Transcript

1.
DIVE INTO HTML5
BY MARK PILGRIM
WITH ILLUSTRATIONS FROM THE PUBLIC DOMAIN

❧

Dive Into HTML5 seeks to elaborate on a hand-picked Selection of features from the HTML5 specification and other fine Standards. The final manuscript has been published on paper by O’Reilly, under the Google Press imprint. Buy the printed Work (artfully titled “HTML5: Up & Running”) and be the first in your Community to receive it. Your kind and sincere Feedback is always welcome. The Work shall remain online under the CC-BY-3.0 License.

diveintohtml5.org

2.
TABLE OF CONTENTS

0. Introduction: Five Things You Should Know About HTML5
1. A Quite Biased History of HTML5
2. Detecting HTML5 Features: It’s Elementary, My Dear Watson
3. What Does It All Mean?
4. Let’s Call It a Draw(ing Surface)
5. Video in a Flash (Without That Other Thing)
6. You Are Here (And So Is Everybody Else)
7. A Place To Put Your Stuff
8. Let’s Take This Offline
9. A Form of Madness
10. “Distributed,” “Extensibility,” And Other Fancy Words
11. The All-In-One Almost-Alphabetical No-Bullshit Guide to Detecting Everything
12. HTML5 Peeks, Pokes and Pointers

❧

“If you’re good at something, never do it for free.” -- The Joker
(but that doesn’t mean you should keep it to yourself)

Copyright MMIX–MMX Mark Pilgrim

4.
TABLE OF CONTENTS

Introduction: Five Things You Should Know About HTML5
  1. It’s not one big thing
  2. You don’t need to throw anything away
  3. It’s easy to get started
  4. It already works
  5. It’s here to stay

A Quite Biased History of HTML5
  i. Diving In
  ii. MIME types
  iii. A long digression into how standards are made
  iv. An unbroken line
  v. A timeline of HTML development from 1997 to 2004
  vi. Everything you know about XHTML is wrong
  vii. A competing vision
  viii. WHAT Working Group?
  ix. Back to the W3C
  x. Postscript

10.
The All-In-One Almost-Alphabetical No-Bullshit Guide to Detecting Everything
  i. Further Reading

HTML5 Peeks, Pokes and Pointers

11.
INTRODUCTION: FIVE THINGS YOU SHOULD KNOW ABOUT HTML5

❧

1. It’s not one big thing

You may well ask: “How can I start using HTML5 if older browsers don’t support it?” But the question itself is misleading. HTML5 is not one big thing; it is a collection of individual features. So you can’t detect “HTML5 support,” because that doesn’t make any sense. But you can detect support for individual features, like canvas, video, or geolocation.

You may think of HTML as tags and angle brackets. That’s an important part of it, but it’s not the whole story. The HTML5 specification also defines how those angle brackets interact with JavaScript, through the Document Object Model (DOM). HTML5 doesn’t just define a <video> tag; there is also a corresponding DOM API for video objects in the DOM. You can use this API to detect support for different video formats, play a video, pause, mute audio, track how much of the video has been downloaded, and everything else you need to build a rich user experience around the <video> tag itself.

12.
Chapter 2 and Appendix A will teach you how to properly detect support for each new HTML5 feature.

2. You don’t need to throw anything away

Love it or hate it, you can’t deny that HTML 4 is the most successful markup format ever. HTML5 builds on that success. You don’t need to throw away your existing markup. You don’t need to relearn things you already know. If your web application worked yesterday in HTML 4, it will still work today in HTML5. Period.

Now, if you want to improve your web applications, you’ve come to the right place. Here’s a concrete example: HTML5 supports all the form controls from HTML 4, but it also includes new input controls. Some of these are long-overdue additions like sliders and date pickers; others are more subtle. For example, the email input type looks just like a text box, but mobile browsers will customize their onscreen keyboard to make it easier to type email addresses. Older browsers that don’t support the email input type will treat it as a regular text field, and the form still works with no markup changes or scripting hacks. This means you can start improving your web forms today, even if some of your visitors are stuck on IE 6.

Read all the gory details about HTML5 forms in Chapter 9.

3. It’s easy to get started

“Upgrading” to HTML5 can be as simple as changing your doctype. The doctype should already be on the first line of every HTML page. Previous versions of HTML defined a lot of doctypes, and choosing the right one could be tricky.

13.
In HTML5, there is only one doctype:

    <!DOCTYPE html>

Upgrading to the HTML5 doctype won’t break your existing markup, because all the tags defined in HTML 4 are still supported in HTML5. But it will allow you to use (and validate) new semantic elements like <article>, <section>, <header>, and <footer>. You’ll learn all about these new elements in Chapter 3.

4. It already works

Whether you want to draw on a canvas, play video, design better forms, or build web applications that work offline, you’ll find that HTML5 is already well-supported. Firefox, Safari, Chrome, Opera, and mobile browsers already support canvas (Chapter 4), video (Chapter 5), geolocation (Chapter 6), local storage (Chapter 7), and more. Google already supports microdata annotations (Chapter 10). Even Microsoft, rarely known for blazing the trail of standards support, will be supporting most HTML5 features in the upcoming Internet Explorer 9.

Each chapter of this book includes the all-too-familiar browser compatibility charts. But more importantly, each chapter includes a frank discussion of your options if you need to support older browsers. HTML5 features like geolocation (Chapter 6) and video (Chapter 5) were first provided by browser plugins like Gears or Flash. Other features, like canvas (Chapter 4), can be emulated entirely in JavaScript. This book will teach you how to target the native features of modern browsers, without leaving older browsers behind.

5. It’s here to stay

14.
Tim Berners-Lee invented the world wide web in the early 1990s. He later founded the W3C to act as a steward of web standards, which the organization has done for more than 15 years. Here is what the W3C had to say about the future of web standards, in July 2009:

    Today the Director announces that when the XHTML 2 Working Group charter expires as scheduled at the end of 2009, the charter will not be renewed. By doing so, and by increasing resources in the HTML Working Group, W3C hopes to accelerate the progress of HTML5 and clarify W3C’s position regarding the future of HTML.

HTML5 is here to stay. Let’s dive in.

❧

DID YOU KNOW?

In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now.

If you liked this introduction and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

16.
№1. HOW DID WE GET HERE?

❧

DIVING IN

Recently, I stumbled across a quote from a Mozilla developer about the tension inherent in creating standards:

    Implementations and specifications have to do a delicate dance together. You don’t want implementations to happen before the specification is finished, because people start depending on the details of implementations and that constrains the specification. However, you also don’t want the specification to be finished before there are implementations and author experience with those implementations, because you need the feedback. There is unavoidable tension here, but we just have to muddle on through.

Keep this quote in the back of your mind, and let me explain how HTML5 came to be.

17.
❧

MIME TYPES

This book is about HTML5, not previous versions of HTML, and not any version of XHTML. But to understand the history of HTML5 and the motivations behind it, you need to understand a few technical details first. Specifically, MIME types.

Every time your web browser requests a page, the web server sends “headers” before it sends the actual page markup. These headers are normally invisible, although there are web development tools that will make them visible if you’re interested. But the headers are important, because they tell your browser how to interpret the page markup that follows. The most important header is called Content-Type, and it looks like this:

    Content-Type: text/html

“text/html” is called the “content type” or “MIME type” of the page. This header is the only thing that determines what a particular resource truly is, and therefore how it should be rendered. Images have their own MIME types (image/jpeg for JPEG images, image/png for PNG images, and so on). JavaScript files have their own MIME type. CSS stylesheets have their own MIME type. Everything has its own MIME type. The web runs on MIME types.

Of course, reality is more complicated than that. The first generation of web servers (and I’m talking web servers from 1993) didn’t send the Content-Type header because it didn’t exist yet. (It wasn’t invented until 1994.) For compatibility reasons that date all the way back to 1993, some popular web browsers will ignore the Content-Type header under certain circumstances. (This is called “content sniffing.”) But as a general rule of thumb, everything you’ve ever looked at on the web -- HTML pages, images, scripts, videos, PDFs, anything

18.
with a URL -- has been served to you with a specific MIME type in the Content-Type header.

Tuck that under your hat. We’ll come back to it.

❧

A LONG DIGRESSION INTO HOW STANDARDS ARE MADE

Why do we have an <img> element? That’s not a question you hear every day. Obviously someone must have created it. These things don’t just appear out of nowhere. Every element, every attribute, every feature of HTML that you’ve ever used -- someone created them, decided how they should work, and wrote it all down. These people are not gods, nor are they flawless. They’re just people. Smart people, to be sure. But just people.

One of the great things about standards that are developed “out in the open” is that you can go back in time and answer these kinds of questions. Discussions occur on mailing lists, which are usually archived and publicly searchable. So I decided to do a bit of “email archaeology” to try to answer the question, “Why do we have an <img> element?” I had to go back to before there was an organization called the World Wide Web Consortium (W3C). I went back to the earliest days of the web, when you could count the number of web servers with both hands and maybe a couple of toes.

19.
(There are a number of typographical errors in the following quotes. I have decided to leave them intact for historical accuracy.)

On February 25, 1993, Marc Andreessen wrote:

    I’d like to propose a new, optional HTML tag:

    IMG

    Required argument is SRC="url".

    This names a bitmap or pixmap file for the browser to attempt to pull over the network and interpret as an image, to be embedded in the text at the point of the tag’s occurrence.

    An example is:

    <IMG SRC="file://foobar.com/foo/bar/blargh.xbm">

    (There is no closing tag; this is just a standalone tag.)

    This tag can be embedded in an anchor like anything else; when that happens, it becomes an icon that’s sensitive to activation just like a regular text anchor.

    Browsers should be afforded flexibility as to which image formats they support. Xbm and Xpm are good ones to support, for example. If a browser cannot interpret a given format, it can do whatever it wants instead (X Mosaic will pop up a default bitmap as a placeholder).

    This is required functionality for X Mosaic; we have this working, and we’ll at least be using it internally. I’m certainly open to suggestions as to how this should be handled within HTML; if you have a better idea than what I’m presenting now, please let me know. I know this is hazy wrt image format, but I don’t see an alternative than to just say “let the browser do what it can” and wait for the perfect solution to come along (MIME, someday, maybe).

20.
Xbm and Xpm were popular graphics formats on Unix systems.

“Mosaic” was one of the earliest web browsers. (“X Mosaic” was the version that ran on Unix systems.) When he wrote this message in early 1993, Marc Andreessen had not yet founded the company that made him famous, Mosaic Communications Corporation, nor had he started work on that company’s flagship product, “Mosaic Netscape.” (You may know them better by their later names, “Netscape Corporation” and “Netscape Navigator.”)

“MIME, someday, maybe” is a reference to content negotiation, a feature of HTTP where a client (like a web browser) tells the server (like a web server) what types of resources it supports (like image/jpeg) so the server can return something in the client’s preferred format. The Original HTTP as defined in 1991 (the only version that was implemented in February 1993) did not have a way for clients to tell servers what kinds of images they supported, thus the design dilemma that Marc faced.

A few hours later, Tony Johnson replied:

    I have something very similar in Midas 2.0 (in use here at SLAC, and due for public release any week now), except that all the names are different, and it has an extra argument NAME="name". It has almost exactly the same functionality as your proposed IMG tag. e.g.

    <ICON name="NoEntry" href="http://note/foo/bar/NoEntry.xbm">

    The idea of the name parameter was to allow the browser to have a set of “built in” images. If the name matches a “built in” image it would use that instead of having to go out and fetch the image. The name could also act as a hint for “line mode” browsers as to what kind of a symbol to put in place of the image.

    I don’t much care about the parameter or tag names, but it would be sensible if we used the same things. I don’t much care for abbreviations, ie why not IMAGE= and SOURCE=. I somewhat prefer ICON since it imlies that the IMAGE should be smallish, but maybe ICON is an overloaded word?

Midas was another early web browser, a contemporary of X Mosaic. It was cross-platform; it ran on both Unix and VMS. “SLAC” refers to the Stanford Linear Accelerator Center, now the

21.
SLAC National Accelerator Laboratory, that hosted the first web server in the United States (in fact the first web server outside Europe). When Tony wrote this message, SLAC was an old-timer on the WWW, having hosted five pages on its web server for a whopping 441 days.

Tony continued:

    While we are on the subject of new tags, I have another, somewhat similar tag, which I would like to support in Midas 2.0. In principle it is:

    <INCLUDE HREF="...">

    The intention here would be that the second document is to be included into the first document at the place where the tag occured. In principle the referenced document could be anything, but the main purpose was to allow images (in this case arbitrary sized) to be embedded into documents. Again the intention would be that when HTTP2 comes along the format of the included document would be up for separate negotiation.

“HTTP2” is a reference to Basic HTTP as defined in 1992. At this point, in early 1993, it was still largely unimplemented. The draft known as “HTTP2” evolved and was eventually standardized as “HTTP 1.0” (albeit not for another three years). HTTP 1.0 did include request headers for content negotiation, a.k.a. “MIME, someday, maybe.”

Tony continued:

    An alternative I was considering was:

    <A HREF="..." INCLUDE>See photo</A>

    I don’t much like adding more functionality to the <A> tag, but the idea here is to maintain compatibility with browsers that can not honour the INCLUDE parameter. The intention is that browsers which do understand INCLUDE, replace the anchor text (in this case “See photo”) with the included document (picture), while older or dumber browsers ignore the INCLUDE tag completely.

This proposal was never implemented, although the idea of providing text if an image is

22.
missing is an important accessibility technique that was missing from Marc’s initial <IMG> proposal. Years later, this feature was bolted on as the <img alt> attribute, which Netscape promptly broke by erroneously treating it as a tooltip.

A few hours after Tony posted his message, Tim Berners-Lee responded:

    I had imagined that figues would be reprented as

    <a name=fig1 href="fghjkdfghj" REL="EMBED, PRESENT">Figure </a>

    where the relation ship values mean

    EMBED    Embed this here when presenting it
    PRESENT  Present this whenever the source document is presented

    Note that you can have various combinations of these, and if the browser doesn’t support either one, it doesn’t break.

    [I] see that using this as a method for selectable icons means nesting anchors. Hmmm. But I hadn’t wanted a special tag.

This proposal was never implemented, but the rel attribute is still around.

Jim Davis added:

    It would be nice if there was a way to specify the content type, e.g.

    <IMG HREF="http://nsa.gov/pub/sounds/gorby.au" CONTENT-TYPE=audio/basic>

    But I am completely willing to live with the requirement that I specify the content type by file extension.

This proposal was never implemented, but Netscape did later add support for embedding of

23.
media objects with the <embed> element.

Jay C. Weber asked:

    While images are at the top of my list of desired medium types in a WWW browser, I don’t think we should add idiosyncratic hooks for media one at a time. Whatever happened to the enthusiasm for using the MIME typing mechanism?

Marc Andreessen replied:

    This isn’t a substitute for the upcoming use of MIME as a standard document mechanism; this provides a necessary and simple implementation of functionality that’s needed independently from MIME.

Jay C. Weber responded:

    Let’s temporarily forget about MIME, if it clouds the issue. My objection was to the discussion of “how are we going to support embedded images” rather than “how are we going to support embedded objections in various media”.

    Otherwise, next week someone is going to suggest ‘lets put in a new tag <AUD SRC="file://foobar.com/foo/bar/blargh.snd">’ for audio.

    There shouldn’t be much cost in going with something that generalizes.

With the benefit of hindsight, it appears that Jay’s concerns were well founded. It took a little more than a week, but HTML5 did finally add new <video> and <audio> elements.

Responding to Jay’s original message, Dave Raggett said:

    True indeed! I want to consider a whole range of possible image/line art types, along with the possibility of format negotiation. Tim’s note on supporting clickable areas within images is also important.

Later in 1993, Dave Raggett proposed HTML+ as an evolution of the HTML standard. The proposal was never implemented, and it was superseded by HTML 2.0. HTML 2.0 was a

24.
“retro-spec,” which means it formalized features already in common use. “This specification brings together, clarifies, and formalizes a set of features that roughly corresponds to the capabilities of HTML in common use prior to June 1994.”

Dave later wrote HTML 3.0, based on his earlier HTML+ draft. Outside of the W3C’s own reference implementation, Arena, HTML 3.0 was never implemented, and it was superseded by HTML 3.2, another “retro-spec”: “HTML 3.2 adds widely deployed features such as tables, applets and text flow around images, while providing full backwards compatibility with the existing standard HTML 2.0.”

Dave later co-authored HTML 4.0, developed HTML Tidy, and went on to help with XHTML, XForms, MathML, and other modern W3C specifications.

Getting back to 1993, Marc replied to Dave:

    Actually, maybe we should think about a general-purpose procedural graphics language within which we can embed arbitrary hyperlinks attached to icons, images, or text, or anything. Has anyone else seen Intermedia’s capabilities wrt this?

Intermedia was a hypertext project from Brown University. It was developed from 1985 to 1991 and ran on A/UX, a Unix-like operating system for early Macintosh computers.

The idea of a “general-purpose procedural graphics language” did eventually catch on. Modern browsers support both SVG (declarative markup with embedded scripting) and <canvas> (a procedural direct-mode graphics API), although the latter started as a proprietary extension before being “retro-specced” by the WHATWG.

Bill Janssen replied:

    Other systems to look at which have this (fairly valuable) notion are Andrew and Slate. Andrew is built with _insets_, each of which has some interesting type, such as text, bitmap, drawing, animation, message, spreadsheet, etc. The notion of arbitrary recursive embedding is present, so that an inset of any kind can be embedded in any other kind which supports embedding. For example, an inset can be embedded at any point in the text of the text widget, or in any rectangular area

25.
    in the drawing widget, or in any cell of the spreadsheet.

“Andrew” is a reference to the Andrew User Interface System (although at that time it was simply known as the Andrew Project).

Meanwhile, Thomas Fine had a different idea:

    Here’s my opinion. The best way to do images in WWW is by using MIME. I’m sure postscript is already a supported subtype in MIME, and it deals very nicely with mixing text and graphics.

    But it isn’t clickable, you say? Yes your right. I suspect there is already an answer to this in display postscript. Even if there isn’t the addition to standard postscript is trivial. Define an anchor command which specifies the URL and uses the current path as a closed region for the button. Since postscript deals so well with paths, this makes arbitrary button shapes trivial.

Display Postscript was an on-screen rendering technology co-developed by Adobe and NeXT.

This proposal was never implemented, but the idea that the best way to fix HTML is to replace it with something else altogether still pops up from time to time.

Tim Berners-Lee, Mar 2, 1993:

    HTTP2 allows a document to contain any type which the user has said he can handle, not just registered MIME types. So one can experiment. Yes I think there is a case for postscript with hypertext. I don’t know whether display postcript has enough. I know Adobe are trying to establish their own postscript-based “PDF” which will have links, and be readable by their proprietory brand of viewers.

    I thought that a generic overlaying language for anchors (Hytime based?) would allow the hypertext and the graphics/video standards to evolve separately, which would help both.

    Let the IMG tag be INCLUDE and let it refer to an arbitrary document type. Or EMBED if INCLUDE sounds like a cpp include which people will expect to provide

26.
    SGML source code to be parsed inline -- not what was intended.

HyTime was an early, SGML-based hypertext document system. It loomed large in early discussions of HTML, and later XML.

Tim’s proposal for an <INCLUDE> tag was never implemented, although you can see echoes of it in <object>, <embed>, and the <iframe> element.

Finally, on Mar 12, 1993, Marc Andreessen revisited the thread:

    Back to the inlined image thread again -- I’m getting close to releasing Mosaic v0.10, which will support inlined GIF and XBM images/bitmaps, as mentioned previously. …

    We’re not prepared to support INCLUDE/EMBED at this point. … So we’re probably going to go with <IMG SRC="url"> (not ICON, since not all inlined images can be meaningfully called icons). For the time being, inlined images won’t be explicitly content-type’d; down the road, we plan to support that (along with the general adaptation of MIME). Actually, the image reading routines we’re currently using figure out the image format on the fly, so the filename extension won’t even be significant.

❧

AN UNBROKEN LINE

I am extraordinarily fascinated with all aspects of this almost-17-year-old conversation that led to the creation of an HTML element that has been used on virtually every web page ever published. Consider:

27.
HTTP still exists. HTTP successfully evolved from 0.9 into 1.0 and later 1.1. And still it evolves.

HTML still exists. That rudimentary data format -- it didn’t even support inline images! -- successfully evolved into 2.0, 3.2, 4.0. HTML is an unbroken line. A twisted, knotted, snarled line, to be sure. There were plenty of “dead branches” in the evolutionary tree, places where standards-minded people got ahead of themselves (and ahead of authors and implementors). But still. Here we are, in 2010, and web pages from 1990 still render in modern browsers. I just loaded one up in the browser of my state-of-the-art Android mobile phone, and I didn’t even get prompted to “please wait while importing legacy format…”

HTML has always been a conversation between browser makers, authors, standards wonks, and other people who just showed up and liked to talk about angle brackets. Most of the successful versions of HTML have been “retro-specs,” catching up to the world while simultaneously trying to nudge it in the right direction. Anyone who tells you that HTML should be kept “pure” (presumably by ignoring browser makers, or ignoring authors, or both) is simply misinformed. HTML has never been pure, and all attempts to purify it have been spectacular failures, matched only by the attempts to replace it.

None of the browsers from 1993 still exist in any recognizable form. Netscape Navigator was abandoned in 1998 and rewritten from scratch to create the Mozilla Suite, which was then forked to create Firefox. Internet Explorer had its humble “beginnings” in “Microsoft Plus! for Windows 95,” where it was bundled with some desktop themes and a pinball game. (But of course that browser can be traced back further too.)

Some of the operating systems from 1993 still exist, but none of them are relevant to the modern web. Most people today who “experience” the web do so on a PC running Windows 2000 or later, a Mac running Mac OS X, a PC running some flavor of Linux, or a handheld device like an iPhone. In 1993, Windows was at version 3.1 (and competing with OS/2), Macs were running System 7, and Linux was distributed via Usenet. (Want to have some fun? Find a graybeard and whisper “Trumpet Winsock” or “MacPPP.”)

Some of the same people are still around and still involved in what we now simply call “web standards.” That’s after almost 20 years. And some were involved in predecessors

28.
of HTML, going back into the 1980s and before.

Speaking of predecessors… With the eventual popularity of HTML and the web, it is easy to forget the contemporary formats and systems that informed its design. Andrew? Intermedia? HyTime? And HyTime was not some rinky-dink academic research project; it was an ISO standard. It was approved for military use. It was Big Business. And you can read about it yourself… on this HTML page, in your web browser.

But none of this answers the original question: why do we have an <img> element? Why not an <icon> element? Or an <include> element? Why not a hyperlink with an include attribute, or some combination of rel values? Why an <img> element? Quite simply, because Marc Andreessen shipped one, and shipping code wins.

That’s not to say that all shipping code wins; after all, Andrew and Intermedia and HyTime shipped code too. Code is necessary but not sufficient for success. And I certainly don’t mean to say that shipping code before a standard will produce the best solution. Marc’s <img> element didn’t mandate a common graphics format; it didn’t define how text flowed around it; it didn’t support text alternatives or fallback content for older browsers. And 17 years later, we’re still struggling with content sniffing, and it’s still a source of crazy security vulnerabilities. And you can trace that all the way back, 17 years, through the Great Browser Wars, all the way back to February 25, 1993, when Marc Andreessen offhandedly remarked, “MIME, someday, maybe,” and then shipped his code anyway.

The ones that win are the ones that ship.

❧

A TIMELINE OF HTML DEVELOPMENT FROM 1997 TO 2004

In December 1997, the World Wide Web Consortium (W3C) published HTML 4.0 and promptly shut down the HTML Working Group. Less than two months later, a separate W3C Working Group published XML 1.0. A mere three months after that, the people who ran the W3C held a workshop called “Shaping the Future of HTML” to answer the question, “Has

29.
W3C given up on HTML?” This was their answer:

    In discussions, it was agreed that further extending HTML 4.0 would be difficult, as would converting 4.0 to be an XML application. The proposed way to break free of these restrictions is to make a fresh start with the next generation of HTML based upon a suite of XML tag-sets.

The W3C re-chartered the HTML Working Group to create this “suite of XML tag-sets.” Their first step, in December 1998, was a draft of an interim specification that simply reformulated HTML in XML without adding any new elements or attributes. This specification later became known as “XHTML 1.0.” It defined a new MIME type for XHTML documents, application/xhtml+xml. However, to ease the migration of existing HTML 4 pages, it also included Appendix C, that “summarizes design guidelines for authors who wish their XHTML documents to render on existing HTML user agents.” Appendix C said you were allowed to author so-called “XHTML” pages but still serve them with the text/html MIME type.

Their next target was web forms. In August 1999, the same HTML Working Group published a first draft of XHTML Extended Forms. They set the expectations in the first paragraph:

    After careful consideration, the HTML Working Group has decided that the goals for the next generation of forms are incompatible with preserving backwards compatibility with browsers designed for earlier versions of HTML. It is our objective to provide a clean new forms model (“XHTML Extended Forms”) based on a set of well-defined requirements. The requirements described in this document are based on experience with a very broad spectrum of form applications.

A few months later, “XHTML Extended Forms” was renamed “XForms” and moved to its own Working Group. That group worked in parallel with the HTML Working Group and finally published the first edition of XForms 1.0 in October 2003.

Meanwhile, with the transition to XML complete, the HTML Working Group set their sights on creating “the next generation of HTML.” In May 2001, they published the first edition of XHTML 1.1, that added only a few minor features on top of XHTML 1.0, but also eliminated the “Appendix C” loophole. Starting with version 1.1, all XHTML documents were to be served with a MIME type of application/xhtml+xml.

30.
❧

EVERYTHING YOU KNOW ABOUT XHTML IS WRONG

Why are MIME types important? Why do I keep coming back to them? Three words: draconian error handling. Browsers have always been “forgiving” with HTML. If you create an HTML page but forget the </head> tag, browsers will display the page anyway. (Certain tags implicitly trigger the end of the <head> and the start of the <body>.) You are supposed to nest tags hierarchically, closing them in last-in-first-out order, but if you create markup like <b><i></b></i>, browsers will just deal with it (somehow) and move on without displaying an error message.

As you might expect, the fact that “broken” HTML markup still worked in web browsers led authors to create broken HTML pages. A lot of broken pages. By some estimates, over 99% of HTML pages on the web today have at least one error in them. But because these errors don’t cause browsers to display visible error messages, nobody ever fixes them.

The W3C saw this as a fundamental problem with the web, and they set out to correct it. XML, published in 1998, broke from the tradition of forgiving clients and mandated that all programs that consumed XML must treat so-called “well-formedness” errors as fatal. This concept of failing on the first error became known as “draconian error handling,” after the Greek leader Draco who instituted the death penalty for relatively minor infractions of his laws. When the W3C reformulated HTML as an XML vocabulary, they mandated that all documents served with the new application/xhtml+xml MIME type would be subject to draconian error handling. If there was even a single well-formedness error in your XHTML page (such as forgetting the </head> tag or improperly nesting start and end tags) web browsers would

31.
have no oice but to stop processing and display an error message to the end user.is idea was not universally popular. With an estimated error rate of 99% on existing pages,the ever-present possibility of displaying errors to the end user, and the dearth of newfeatures in XHTML 1.0 and 1.1 to justify the cost, web authors basically ignoredapplication/xhtml+xml . But that doesn’t mean they ignored XHTML altogether. Oh,most deﬁnitely not. Appendix C of the XHTML 1.0 speciﬁcation gave the web authors of theworld a loophole: “Use something that looks kind of like XHTML syntax, but keep serving itwith the text/html MIME type.” And that’s exactly what thousands of web developers did:they “upgraded” to XHTML syntax but kept serving it with a text/html MIME type.Even today, millions of web pages claim to be XHTML. ey start with the XHTML doctypeon the ﬁrst line, use lowercase tag names, use quotes around aribute values, and add atrailing slash aer empty elements like <br /> and <hr />. But only a tiny fraction ofthese pages are served with the application/xhtml+xml MIME type that would triggerXML’s draconian error handling. Any page served with a MIME type of text/html —regardless of doctype, syntax, or coding style — will be parsed using a “forgiving” HTMLparser, silently ignoring any markup errors, and never alerting end users (or anyone else) evenif the page is tenically broken.XHTML 1.0 included this loophole, but XHTML 1.1 closed it, and the never-ﬁnalized XHTML2.0 continued the tradition of requiring draconian error handling. And that’s why there arebillions of pages that claim to be XHTML 1.0, and only a handful that claim to be XHTML1.1 (or XHTML 2.0). So are you really using XHTML? Che your MIME type. (Actually, ifyou don’t know what MIME type you’re using, I can prey mu guarantee that you’re stillusing text/html.) Unless you’re serving your pages with a MIME type ofapplication/xhtml+xml , your so-called “ XHTML” is XML in name only. 
❧ A COMPETING VISION

In June 2004, the W3C held the Workshop on Web Applications and Compound Documents. Present at this workshop were representatives of three browser vendors, web development

32.
companies, and other W3C members. A group of interested parties, including the Mozilla Foundation and Opera Software, gave a presentation on their competing vision of the future of the web: an evolution of the existing HTML 4 standard to include new features for modern web application developers.

    The following seven principles represent what we believe to be the most critical requirements for this work.

    Backwards compatibility, clear migration path
        Web application technologies should be based on technologies authors are familiar with, including HTML, CSS, DOM, and JavaScript. Basic Web application features should be implementable using behaviors, scripting, and style sheets in IE6 today so that authors have a clear migration path. Any solution that cannot be used with the current high-market-share user agent without the need for binary plug-ins is highly unlikely to be successful.

    Well-defined error handling
        Error handling in Web applications must be defined to a level of detail where User Agents do not have to invent their own error handling mechanisms or reverse engineer other User Agents’.

    Users should not be exposed to authoring errors
        Specifications must specify exact error recovery behaviour for each possible error scenario. Error handling should for the most part be defined in terms of graceful error recovery (as in CSS), rather than obvious and catastrophic failure (as in XML).

    Practical use
        Every feature that goes into the Web Applications specifications must be justified by a practical use case. The reverse is not necessarily true: every use case does not necessarily warrant a new feature. Use cases should preferably be based on real sites where the authors previously used a poor solution to work around the limitation.

    Scripting is here to stay
        But should be avoided where more convenient declarative markup can be used. Scripting should be device and presentation neutral unless scoped in a device-specific way (e.g. unless included in XBL).

33.
    Device-specific profiling should be avoided
        Authors should be able to depend on the same features being implemented in desktop and mobile versions of the same UA.

    Open process
        The Web has benefited from being developed in an open environment. Web Applications will be core to the web, and its development should also take place in the open. Mailing lists, archives and draft specifications should continuously be visible to the public.

In a straw poll, the workshop participants were asked, “Should the W3C develop declarative extension to HTML and CSS and imperative extensions to DOM, to address medium level Web Application requirements, as opposed to sophisticated, fully-fledged OS-level APIs? (proposed by Ian Hickson, Opera Software)” The vote was 11 to 8 against. In their summary of the workshop, the W3C wrote, “At present, W3C does not intend to put any resources into the third straw-poll topic: extensions to HTML and CSS for Web Applications, other than technologies being developed under the charter of current W3C Working Groups.”

Faced with this decision, the people who had proposed evolving HTML and HTML forms had only two choices: give up, or continue their work outside of the W3C. They chose the latter and registered the whatwg.org domain, and in June 2004, the WHAT Working Group was born.

❧ WHAT WORKING GROUP?

34.
What the he is the WHAT Working Group? I’ll let themexplain it for themselves : e Web Hypertext Applications Tenology Working Group is a loose, unoﬃcial, and open collaboration of Web browser manufacturers and interested parties. e group aims to develop speciﬁcations based on HTML and related tenologies to ease the deployment of interoperable Web Applications, with the intention of submiing the results to a standards organisation. is submission would then form the basis of work on formally extending HTML in the standards tra. e creation of this forum follows from several months of work by private e-mail on speciﬁcations for su tenologies. e main focus up to this point has been extending HTML4 Forms to support features requested by authors, without breaking bawards compatibility with existing content. is group was created to ensure that future development of these speciﬁcations will be completely open, through a publicly-arived, open mailing list.e key phrase here is “without breaking baward compatibility.” XHTML (minus theAppendix C loophole) is not bawardly compatible with HTML. It requires an entirely newMIME type, and it mandates draconian error handling for all content served with that MIMEtype. XForms is not bawardly compatible with HTML forms, because it can only be used indocuments that are served with the new XHTML MIME type, whi means that XForms alsomandates draconian error handling. All roads lead to MIME.Instead of scrapping over a decade’s worth of investment in HTML and making 99% ofexisting web pages unusable, the WHAT Working Group decided to take a diﬀerent approa:documenting the “forgiving” error-handling algorithms that browsers actually used. Webbrowsers have always been forgiving of HTML errors, but nobody had ever bothered to writedown exactly how they did it. NCSA Mosaic had its own algorithms for dealing with brokenpages, and Netscape tried to mat them. en Internet Explorer tried to mat Netscape. 
enOpera and Firefox tried to mat Internet Explorer. en Safari tried to mat Firefox. And sodiveintohtml5.org HOW DID WE GET HERE?

35.
on, right up to the present day. Along the way, developers burned thousands and thousands of hours trying to make their products compatible with their competitors’.

If that sounds like an insane amount of work, that’s because it is. Or rather, it was. It took five years, but (modulo a few obscure edge cases) the WHAT Working Group successfully documented how to parse HTML in a way that is compatible with existing web content. Nowhere in the final algorithm is there a step that mandates that the HTML consumer should stop processing and display an error message to the end user.

While all that reverse-engineering was going on, the WHAT Working Group was quietly working on a few other things, too. One of them was a specification, initially dubbed Web Forms 2.0, that added new types of controls to HTML forms. (You’ll learn more about web forms in A Form of Madness.) Another was a draft specification called “Web Applications 1.0,” that included major new features like a direct-mode drawing canvas and native support for audio and video without plugins.

❧ BACK TO THE W3C

For two and a half years, the W3C and the WHAT Working Group largely ignored each other. While the WHAT Working Group focused on web forms and new HTML features, the W3C HTML Working Group was busy with version 2.0 of XHTML. But by October 2006, it was clear that the WHAT Working Group had picked up serious momentum, while XHTML 2 was still languishing in draft form, unimplemented by any major browser. In October 2006, Tim Berners-Lee, the founder of the W3C itself, announced that the W3C would work together with the WHAT Working Group to evolve

36.
HTML.

    Some things are clearer with hindsight of several years. It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces all at once didn’t work. The large HTML-generating public did not move, largely because the browsers didn’t complain. Some large communities did shift and are enjoying the fruits of well-formed systems, but not all. It is important to maintain HTML incrementally, as well as continuing a transition to well-formed world, and developing more power in that world.

    The plan is to charter a completely new HTML group. Unlike the previous one, this one will be chartered to do incremental improvements to HTML, as also in parallel xHTML. It will have a different chair and staff contact. It will work on HTML and xHTML together. We have strong support for this group, from many people we have talked to, including browser makers.

    There will also be work on forms. This is a complex area, as existing HTML forms and XForms are both form languages. HTML forms are ubiquitously deployed, and there are many implementations and users of XForms. Meanwhile, the Webforms submission has suggested sensible extensions to HTML forms. The plan is, informed by Webforms, to extend HTML forms.

One of the first things the newly re-chartered W3C HTML Working Group decided was to rename “Web Applications 1.0” to “HTML5.” And here we are, diving into HTML5.

❧ POSTSCRIPT

In October 2009, the W3C shut down the XHTML 2 Working Group and issued this statement to explain their decision:

37.
    When W3C announced the HTML and XHTML 2 Working Groups in March 2007, we indicated that we would continue to monitor the market for XHTML 2. W3C recognizes the importance of a clear signal to the community about the future of HTML. While we recognize the value of the XHTML 2 Working Group’s contributions over the years, after discussion with the participants, W3C management has decided to allow the Working Group’s charter to expire at the end of 2009 and not to renew it.

The ones that win are the ones that ship.

❧ FURTHER READING

The History of the Web, an old draft by Ian Hickson
HTML/History, by Michael Smith, Henri Sivonen, and others
A Brief History of HTML, by Scott Reynen

❧

This has been “How Did We Get Here?” The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?

In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

38.
If you liked this apter and want to show your appreciation, you can buy “HTML5: Up & Running” with this aﬃliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a bu. I do not currently accept direct donations. Copyright MMIX–MMX Mark Pilgrim powered by Google™ Searchdiveintohtml5.org HOW DID WE GET HERE?

39.
№2. DETECTING HTML5 FEATURES

❧ DIVING IN

You may well ask: “How can I start using HTML5 if older browsers don’t support it?” But the question itself is misleading. HTML5 is not one big thing; it is a collection of individual features. So you can’t detect “HTML5 support,” because that doesn’t make any sense. But you can detect support for individual features, like canvas, video, or geolocation.

❧ DETECTION TECHNIQUES

When your browser renders a web page, it constructs a Document Object Model (DOM), a collection of objects that represent the HTML elements on the page. Every element — every

40.
<p>, every <div>, every <span> — is represented in the DOM by a different object. (There are also global objects, like window and document, that aren’t tied to specific elements.)

All DOM objects share a set of common properties, but some objects have more than others. In browsers that support HTML5 features, certain objects will have unique properties. A quick peek at the DOM will tell you which features are supported.

There are four basic techniques for detecting whether a browser supports a particular feature. From simplest to most complex:

1. Check if a certain property exists on a global object (such as window or navigator).
   Example: testing for geolocation support
2. Create an element, then check if a certain property exists on that element.
   Example: testing for canvas support
3. Create an element, check if a certain method exists on that element, then call the method and check the value it returns.
   Example: testing which video formats are supported
4. Create an element, set a property to a certain value, then check if the property has retained its value.
   Example: testing which <input> types are supported

❧
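The four techniques above can be sketched as four small, generic helpers. Treat this as an illustration, not as code from any library: the function names are hypothetical, and each helper takes the document (or global object) as a parameter, where a real page would simply use document, window, or navigator.

```javascript
// 1. Check whether a property exists on a global object.
function has_global_property(global_object, name) {
  return name in global_object;
}

// 2. Create an element, then check whether a property exists on it.
function element_has_property(doc, tag, name) {
  return name in doc.createElement(tag);
}

// 3. Create an element, check that a method exists, call it, and
//    return whatever it reports ("" if the method is missing).
function element_method_result(doc, tag, method, argument) {
  var element = doc.createElement(tag);
  if (typeof element[method] !== "function") { return ""; }
  return element[method](argument);
}

// 4. Create an element, set a property, then check whether the
//    property retained the value.
function element_retains_value(doc, tag, name, value) {
  var element = doc.createElement(tag);
  element[name] = value;
  return element[name] === value;
}

// In a browser, for example:
//   has_global_property(navigator, "geolocation")
//   element_has_property(document, "canvas", "getContext")
```

The rest of this chapter writes each test out by hand, because real features tend to need their own small twists (a try..catch here, a specific codec string there).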

41.
MODERNIZR, AN HTML5 DETECTION LIBRARY

Modernizr is an open source, MIT-licensed JavaScript library that detects support for many HTML5 & CSS3 features. At the time of writing, the latest version is 1.5. You should always use the latest version. To use it, include the following <script> element at the top of your page.

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="utf-8">
      <title>Dive Into HTML5</title>
      <script src="modernizr.min.js"></script>
    </head>
    <body>
      ...
    </body>
    </html>

↜ It goes to your <head>

Modernizr runs automatically. There is no modernizr_init() function to call. When it runs, it creates a global object called Modernizr, that contains a set of Boolean properties for each feature it can detect. For example, if your browser supports the canvas API, the Modernizr.canvas property will be true. If your browser does not support the canvas API, the Modernizr.canvas property will be false.

    if (Modernizr.canvas) {
      // let's draw some shapes!
    } else {
      // no native canvas support available :(
    }

❧
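To make the idea of “a global object with a Boolean property per feature” concrete, here is a minimal sketch of how such an object could be assembled. This is not Modernizr’s actual source; build_feature_map is a hypothetical name, and the table of tests is whatever detection functions you choose to pass in.

```javascript
// Hypothetical sketch: run a table of detection functions once and
// record each result as a Boolean property, in the spirit of the
// Modernizr object described above.
function build_feature_map(tests) {
  var features = {};
  for (var name in tests) {
    if (Object.prototype.hasOwnProperty.call(tests, name)) {
      // !! forces whatever the test returns into true or false
      features[name] = !!tests[name]();
    }
  }
  return features;
}

// In a browser, for example:
//   var features = build_feature_map({
//     canvas: function () {
//       return document.createElement("canvas").getContext;
//     }
//   });
//   if (features.canvas) { /* draw something */ }
```

Running every test once and caching the answers means the rest of your code can branch on a plain property instead of creating dummy elements over and over.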

42.
CANVAS

HTML5 defines the <canvas> element as “a resolution-dependent bitmap canvas that can be used for rendering graphs, game graphics, or other visual images on the fly.” A canvas is a rectangle in your page where you can use JavaScript to draw anything you want. HTML5 defines a set of functions (“the canvas API”) for drawing shapes, defining paths, creating gradients, and applying transformations.

Checking for the canvas API uses detection technique #2. If your browser supports the canvas API, the DOM object it creates to represent a <canvas> element will have a getContext() method. If your browser doesn’t support the canvas API, the DOM object it creates for a <canvas> element will only have the set of common properties, but not anything canvas-specific.

    function supports_canvas() {
      return !!document.createElement('canvas').getContext;
    }

This function starts by creating a dummy <canvas> element. But the element is never attached to your page, so no one will ever see it. It’s just floating in memory, going nowhere and doing nothing, like a canoe on a lazy river.

    return !!document.createElement('canvas').getContext;

As soon as you create the dummy <canvas> element, you test for the presence of a getContext() method. This method will only exist if your browser supports the canvas API.

    return !!document.createElement('canvas').getContext;

43.
Finally, you use the double-negative trick to force the result to a Boolean value (true or false).

    return !!document.createElement('canvas').getContext;

This function will detect support for most of the canvas API, including shapes, paths, gradients & patterns. It will not detect the third-party explorercanvas library that implements the canvas API in Microsoft Internet Explorer.

Instead of writing this function yourself, you can use Modernizr to detect support for the canvas API.

↶ check for canvas support

    if (Modernizr.canvas) {
      // let's draw some shapes!
    } else {
      // no native canvas support available :(
    }

There is a separate test for the canvas text API, which I will demonstrate next.

❧ CANVAS TEXT

44.
Even if your browser supports the canvas API, it might not support the canvas text API. The canvas API grew over time, and the text functions were added late in the game. Some browsers shipped with canvas support before the text API was complete.

Checking for the canvas text API uses detection technique #2. If your browser supports the canvas API, the DOM object it creates to represent a <canvas> element will have the getContext() method. If your browser doesn’t support the canvas API, the DOM object it creates for a <canvas> element will only have the set of common properties, but not anything canvas-specific.

    function supports_canvas_text() {
      if (!supports_canvas()) { return false; }
      var dummy_canvas = document.createElement('canvas');
      var context = dummy_canvas.getContext('2d');
      return typeof context.fillText == 'function';
    }

The function starts by checking for canvas support, using the supports_canvas() function you just saw in the previous section. If your browser doesn’t support the canvas API, it certainly won’t support the canvas text API!

    if (!supports_canvas()) { return false; }

Next, you create a dummy <canvas> element and get its drawing context. This is guaranteed to work, because the supports_canvas() function already checked that the getContext() method exists on all canvas objects.

    var dummy_canvas = document.createElement('canvas');
    var context = dummy_canvas.getContext('2d');

45.
Finally, you e whether the drawing context has a fillText() function. If it does, thecanvas text API is available. Hooray! return typeof context.fillText == function;Instead of writing this function yourself, you can use Modernizr to detect support for thecanvas text API. ↶ check for canvas text support if (Modernizr.canvastext) { // lets draw some text! } else { // no native canvas text support available :( } ❧ VIDEOHTML5 deﬁnes a new element called <video> for embedding video in your web pages.Embedding video used to be impossible without third-party plugins su as AppleiTime® or Adobe Flash®.e <video> element is designed to be usable without anydetection scripts. You can specify multiple video ﬁles, andbrowsers that support HTML5 video will oose one basedon what video formats they support. (See “A gentleintroduction to video encoding” part 1: container formatsand part 2: lossy video codecs to learn about diﬀerent videoformats.)diveintohtml5.org DETECTING HTML5 FEATURES

46.
Browsers that don’t support HTML5 video will ignore the <video> element completely, but you can use this to your advantage and tell them to play video through a third-party plugin instead. Kroc Camen has designed a solution called Video for Everybody! that uses HTML5 video where available, but falls back to QuickTime or Flash in older browsers. This solution uses no JavaScript whatsoever, and it works in virtually every browser, including mobile browsers.

If you want to do more with video than plop it on your page and play it, you’ll need to use JavaScript. Checking for video support uses detection technique #2. If your browser supports HTML5 video, the DOM object it creates to represent a <video> element will have a canPlayType() method. If your browser doesn’t support HTML5 video, the DOM object it creates for a <video> element will have only the set of properties common to all elements. You can check for video support using this function:

    function supports_video() {
      return !!document.createElement('video').canPlayType;
    }

Instead of writing this function yourself, you can use Modernizr to detect support for HTML5 video.

↶ check for HTML5 video support

    if (Modernizr.video) {
      // let's play some video!
    } else {
      // no native video support available :(
      // maybe check for QuickTime or Flash instead
    }

In the Video chapter, I’ll explain another solution that uses these detection techniques to convert <video> elements to Flash-based video players, for the benefit of browsers that don’t support HTML5 video.

47.
There is a separate test for detecting which video formats your browser can play, which I will demonstrate next.

❧ VIDEO FORMATS

Video formats are like written languages. An English newspaper may convey the same information as a Spanish newspaper, but if you can only read English, only one of them will be useful to you! To play a video, your browser needs to understand the “language” in which the video was written.

The “language” of a video is called a “codec” — this is the algorithm used to encode the video into a stream of bits. There are dozens of codecs in use all over the world. Which one should you use? The unfortunate reality of HTML5 video is that browsers can’t agree on a single codec. However, they seem to have narrowed it down to two. One codec costs money (because of patent licensing), but it works in Safari and on the iPhone. (This one also works in Flash if you use a solution like Video for Everybody!) The other codec is free and works in open source browsers like Chromium and Mozilla Firefox.

Checking for video format support uses detection technique #3. If your browser supports HTML5 video, the DOM object it creates to represent a <video> element will have a canPlayType() method. This method will tell you whether the browser supports a particular video format.

This function checks for the patent-encumbered format supported by Macs and iPhones.

    function supports_h264_baseline_video() {

48.
      if (!supports_video()) { return false; }
      var v = document.createElement("video");
      return v.canPlayType('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
    }

The function starts by checking for HTML5 video support, using the supports_video() function you just saw in the previous section. If your browser doesn’t support HTML5 video, it certainly won’t support any video formats!

    if (!supports_video()) { return false; }

Then the function creates a dummy <video> element (but doesn’t attach it to the page, so it won’t be visible) and calls the canPlayType() method. This method is guaranteed to be there, because the supports_video() function just checked for it.

    var v = document.createElement("video");

A “video format” is really a combination of different things. In technical terms, you’re asking the browser whether it can play H.264 Baseline video and AAC LC audio in an MPEG-4 container. (I’ll explain what all that means in the Video chapter. You might also be interested in reading A gentle introduction to video encoding.)

    return v.canPlayType('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');

The canPlayType() function doesn’t return true or false. In recognition of how complex video formats are, the function returns a string:

"probably" if the browser is fairly confident it can play this format
"maybe" if the browser thinks it might be able to play this format
"" (an empty string) if the browser is certain it can’t play this format

This second function checks for the open video format supported by Mozilla Firefox and other open source browsers. The process is exactly the same; the only difference is the string you pass in to the canPlayType() function. In technical terms, you’re asking the browser

50.
❧ LOCAL STORAGE

HTML5 storage provides a way for web sites to store information on your computer and retrieve it later. The concept is similar to cookies, but it’s designed for larger quantities of information. Cookies are limited in size, and your browser sends them back to the web server every time it requests a new page (which takes extra time and precious bandwidth). HTML5 storage stays on your computer, and web sites can access it with JavaScript after the page is loaded.

ASK PROFESSOR MARKUP

☞ Q: Is local storage really part of HTML5? Why is it in a separate specification?
A: The short answer is yes, local storage is part of HTML5. The slightly longer answer is that local storage used to be part of the main HTML5 specification, but it was split out into a separate specification because some people in the HTML5 Working Group complained that HTML5 was too big. If that sounds like slicing a pie into more pieces to reduce the total

51.
number of calories… well, welcome to the wacky world of standards.

Checking for HTML5 storage support uses detection technique #1. If your browser supports HTML5 storage, there will be a localStorage property on the global window object. If your browser doesn’t support HTML5 storage, the localStorage property will be undefined. Due to an unfortunate bug in older versions of Firefox, this test will raise an exception if cookies are disabled, so the entire test is wrapped in a try..catch statement.

    function supports_local_storage() {
      try {
        return 'localStorage' in window && window['localStorage'] !== null;
      } catch(e){
        return false;
      }
    }

Instead of writing this function yourself, you can use Modernizr (1.1 or later) to detect support for HTML5 local storage.

↶ check for HTML5 local storage

    if (Modernizr.localstorage) {
      // window.localStorage is available!
    } else {
      // no native support for local storage :(
      // maybe try Gears or another third-party solution
    }

Note that JavaScript is case-sensitive. The Modernizr attribute is called localstorage (all lowercase), but the DOM property is called window.localStorage (mixed case).
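Once the check passes, using the storage is a matter of calling setItem() and getItem(). The sketch below wraps both calls behind the same test. The helper names (storage_available, remember, recall) are made up for this example, and the window object is passed in as a parameter so the code can also be tried outside a browser.

```javascript
// Same check as supports_local_storage(), but taking the window
// object as a parameter.
function storage_available(win) {
  try {
    return "localStorage" in win && win["localStorage"] !== null;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: store a value only when storage is available.
// Returns true if the value was stored, false otherwise.
function remember(win, key, value) {
  if (!storage_available(win)) { return false; }
  win.localStorage.setItem(key, String(value));
  return true;
}

// Hypothetical helper: read a value back, or null when storage is
// unavailable or the key was never stored.
function recall(win, key) {
  if (!storage_available(win)) { return null; }
  return win.localStorage.getItem(key);
}

// In a browser, for example:
//   remember(window, "last_chapter", 2);
//   recall(window, "last_chapter");
```

Note that values come back as strings, so a stored number needs converting when you read it again.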

52.
ASK PROFESSOR MARKUP

☞ Q: How secure is my HTML5 storage database? Can anyone read it?
A: Anyone who has physical access to your computer can probably look at (or even change) your HTML5 storage database. Within your browser, any web site can read and modify its own values, but sites can’t access values stored by other sites. This is called a same-origin restriction.

❧ WEB WORKERS

Web Workers provide a standard way for browsers to run JavaScript in the background. With web workers, you can spawn multiple “threads” that all run at the same time, more or less. (Think of how your computer can run multiple applications at the same time, and you’re most of the way there.) These “background threads” can do complex mathematical calculations, make network requests, or access local storage while the main web page responds to the user scrolling, clicking, or typing.

Checking for web workers uses detection technique #1. If your browser supports the Web Worker API, there will be a Worker property on the global window object. If your browser doesn’t support the Web Worker API, the Worker property will be undefined.

53.
    function supports_web_workers() {
      return !!window.Worker;
    }

Instead of writing this function yourself, you can use Modernizr (1.1 or later) to detect support for web workers.

↶ check for web workers

    if (Modernizr.webworkers) {
      // window.Worker is available!
    } else {
      // no native support for web workers :(
      // maybe try Gears or another third-party solution
    }

Note that JavaScript is case-sensitive. The Modernizr attribute is called webworkers (all lowercase), but the DOM object is called window.Worker (with a capital “W” in “Worker”).

❧ OFFLINE WEB APPLICATIONS

Reading static web pages offline is easy: connect to the Internet, load a web page, disconnect from the Internet, drive to a secluded cabin, and read the web page at your leisure. (To save time, you may wish to skip the step about the cabin.) But what about web applications like Gmail or Google Docs? Thanks to HTML5, anyone (not just Google!) can build a web application that works offline.

Offline web applications start out as online web

54.
applications. The first time you visit an offline-enabled web site, the web server tells your browser which files it needs in order to work offline. These files can be anything — HTML, JavaScript, images, even videos. Once your browser downloads all the necessary files, you can revisit the web site even if you’re not connected to the Internet. Your browser will notice that you’re offline and use the files it has already downloaded. When you get back online, any changes you’ve made can be uploaded to the remote web server.

Checking for offline support uses detection technique #1. If your browser supports offline web applications, there will be an applicationCache property on the global window object. If your browser doesn’t support offline web applications, the applicationCache property will be undefined. You can check for offline support with the following function:

    function supports_offline() {
      return !!window.applicationCache;
    }

Instead of writing this function yourself, you can use Modernizr (1.1 or later) to detect support for offline web applications.

↶ check for offline support

    if (Modernizr.applicationcache) {
      // window.applicationCache is available!
    } else {
      // no native support for offline :(
      // maybe try Gears or another third-party solution
    }

Note that JavaScript is case-sensitive. The Modernizr attribute is called applicationcache (all lowercase), but the DOM object is called window.applicationCache (mixed case).

❧

55.
GEOLOCATION

Geolocation is the art of figuring out where you are in the world and (optionally) sharing that information with people you trust. There is more than one way to figure out where you are — your IP address, your wireless network connection, which cell tower your phone is talking to, or dedicated GPS hardware that calculates latitude and longitude from information sent by satellites in the sky.

ASK PROFESSOR MARKUP

☞ Q: Is geolocation part of HTML5? Why are you talking about it?
A: Geolocation support is being added to browsers right now, along with support for new HTML5 features. Strictly speaking, geolocation is being standardized by the Geolocation Working Group, which is separate from the HTML5 Working Group. But I’m going to talk about geolocation in this book

56.
anyway, because it’s part of the evolution of the web that’s happening now.

Checking for geolocation support uses detection technique #1. If your browser supports the geolocation API, there will be a geolocation property on the global navigator object. If your browser doesn’t support the geolocation API, the geolocation property will be undefined. Here’s how to check for geolocation support:

    function supports_geolocation() {
      return !!navigator.geolocation;
    }

Instead of writing this function yourself, you can use Modernizr to detect support for the geolocation API.

↶ check for geolocation support

    if (Modernizr.geolocation) {
      // let's find out where you are!
    } else {
      // no native geolocation support available :(
      // maybe try Gears or another third-party solution
    }

If your browser does not support the geolocation API natively, there is still hope. Gears is an open source browser plugin from Google that works on Windows, Mac, Linux, Windows Mobile, and Android. It provides features for older browsers that do not support all the fancy new stuff we’ve discussed in this chapter. One of the features that Gears provides is a geolocation API. It’s not the same as the navigator.geolocation API, but it serves the same purpose.

There are also device-specific geolocation APIs on older mobile phone platforms, including BlackBerry, Nokia, Palm, and OMTP BONDI.
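When the navigator.geolocation check does pass, asking for a position might look like the sketch below. It relies on the standard getCurrentPosition() method and its success and error callbacks; the locate helper and its callback parameters are hypothetical names for this example, and the navigator object is passed in so the function can be tried outside a browser.

```javascript
// Hypothetical helper: asks for the current position when the
// geolocation API exists. `nav` is normally the global `navigator`.
// Returns true if a request was made, false if the API is missing.
function locate(nav, on_found, on_unavailable) {
  if (!nav.geolocation) {
    on_unavailable();
    return false;
  }
  nav.geolocation.getCurrentPosition(
    function (position) {
      on_found(position.coords.latitude, position.coords.longitude);
    },
    function () {
      // the user declined, or the position could not be determined
      on_unavailable();
    }
  );
  return true;
}

// In a browser, for example:
//   locate(navigator,
//          function (lat, lon) { /* show a map */ },
//          function () { /* ask the user to type a city instead */ });
```

Routing both “no API” and “request failed” to the same fallback keeps the calling code simple; a real page may want to tell those two cases apart.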

58.
This will prove to be vitally important.

    var i = document.createElement("input");

Next, set the type attribute on the dummy <input> element to the input type you want to detect.

    i.setAttribute("type", "color");

If your browser supports that particular input type, the type property will retain the value you set. If your browser doesn’t support that particular input type, it will ignore the value you set and the type property will still be "text".

    return i.type !== "text";

Instead of writing 13 separate functions yourself, you can use Modernizr to detect support for all the new input types defined in HTML5. Modernizr reuses a single <input> element to efficiently detect support for all 13 input types. Then it builds a hash called Modernizr.inputtypes that contains 13 keys (the HTML5 type attributes) and 13 Boolean values (true if supported, false if not).

↶ check for native date picker

    if (!Modernizr.inputtypes.date) {
      // no native support for <input type="date"> :(
      // maybe build one yourself with Dojo or jQueryUI
    }

❧

PLACEHOLDER TEXT

Besides new input types, HTML5

59.
includes several small tweaks to existing forms. One improvement is the ability to set placeholder text in an input field. Placeholder text is displayed inside the input field as long as the field is empty and not focused. As soon as you click on (or tab to) the input field, the placeholder text disappears. The chapter on web forms has screenshots if you’re having trouble visualizing it.

Checking for placeholder support uses detection technique #2. If your browser supports placeholder text in input fields, the DOM object it creates to represent an <input> element will have a placeholder property (even if you don’t include a placeholder attribute in your HTML). If your browser doesn’t support placeholder text, the DOM object it creates for an <input> element will not have a placeholder property.

    function supports_input_placeholder() {
      var i = document.createElement('input');
      return 'placeholder' in i;
    }

Instead of writing this function yourself, you can use Modernizr (1.1 or later) to detect support for placeholder text.

↶ check for placeholder text

    if (Modernizr.input.placeholder) {
      // your placeholder text should already be visible!
    } else {
      // no placeholder support :(
      // fall back to a scripted solution
    }

❧

FORM AUTOFOCUS

60.
Web sites can use JavaScript to focus the first input field of a web form automatically. For example, the home page of Google.com will autofocus the input box so you can type your search keywords without having to position the cursor in the search box. While this is convenient for most people, it can be annoying for power users or people with special needs. If you press the space bar expecting to scroll the page, the page will not scroll because the focus is already in a form input field. (It types a space in the field instead of scrolling.) If you focus a different input field while the page is still loading, the site’s autofocus script may “helpfully” move the focus back to the original input field upon completion, disrupting your flow and causing you to type in the wrong place.

Because the autofocusing is done with JavaScript, it can be tricky to handle all of these edge cases, and there is little recourse for people who don’t want a web page to “steal” the focus.

To solve this problem, HTML5 introduces an autofocus attribute on all web form controls. The autofocus attribute does exactly what it says on the tin: it moves the focus to a particular input field. But because it’s just markup instead of a script, the behavior will be consistent across all web sites. Also, browser vendors (or extension authors) can offer users a way to disable the autofocusing behavior.

Checking for autofocus support uses detection technique #2. If your browser supports autofocusing web form controls, the DOM object it creates to represent an <input> element will have an autofocus property (even if you don’t include the autofocus attribute in your HTML). If your browser doesn’t support autofocusing web form controls, the DOM object it creates for an <input> element will not have an autofocus property. You can detect autofocus support with this function:

    function supports_input_autofocus() {
      var i = document.createElement('input');
      return 'autofocus' in i;
    }

Instead of writing this function yourself, you can use Modernizr (1.1 or later) to detect support

61.
for autofocused form fields.

↶ check for autofocus support

    if (Modernizr.input.autofocus) {
      // autofocus works!
    } else {
      // no autofocus support :(
      // fall back to a scripted solution
    }

❧

MICRODATA

Microdata is a standardized way to provide additional semantics in your web pages. For example, you can use microdata to declare that a photograph is available under a specific Creative Commons license. As you’ll see in the distributed extensibility chapter, you can use microdata to mark up an “About Me” page. Browsers, browser extensions, and search engines can convert your HTML5 microdata markup into a vCard, a standard format for sharing contact information. You can also define your own microdata vocabularies.

The HTML5 microdata standard includes both HTML markup (primarily for search engines) and a set of DOM functions (primarily for browsers). There’s no harm in including microdata markup in your web pages. It’s nothing more than a few well-placed attributes, and search engines that don’t understand the microdata attributes will just ignore them. But if you need to access or manipulate microdata through the DOM, you’ll need to check whether the

62.
browser supports the microdata DOM API.

Checking for HTML5 microdata API support uses detection technique #1. If your browser supports the HTML5 microdata API, there will be a getItems() function on the global document object. If your browser doesn’t support microdata, the getItems() function will be undefined.

    function supports_microdata_api() {
      return !!document.getItems;
    }

Modernizr does not yet support checking for the microdata API, so you’ll need to use a function like the one listed above.

FURTHER READING

Specifications and standards:
  the <canvas> element
  the <video> element
  <input> types
  the <input placeholder> attribute
  the <input autofocus> attribute
  HTML5 storage
  Web Workers
  Offline web applications
  Geolocation API

JavaScript libraries:
  Modernizr, an HTML5 detection library
  geo.js, a geolocation API wrapper

Other articles and tutorials:

63.
  Video for Everybody!
  A gentle introduction to video encoding
  Video type parameters
  The All-In-One Almost-Alphabetical No-Bullshit Guide to Detecting Everything

❧

This has been “Detecting HTML5 Features.” The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?
In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

64.
№3. WHAT DOES IT ALL MEAN?

❧

DIVING IN

This chapter will take an HTML page that has absolutely nothing wrong with it, and improve it. Parts of it will become shorter. Parts will become longer. All of it will become more semantic. It’ll be awesome.

Here is the page in question. Learn it. Live it. Love it. Open it in a new tab and don’t come back until you’ve hit “View Source” at least once.

❧

THE DOCTYPE

From the top:

    <!DOCTYPE html

65.
    PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

This is called the “doctype.” There’s a long history (and a black art) behind the doctype. While working on Internet Explorer 5 for Mac, the developers at Microsoft found themselves with a surprising problem. The upcoming version of their browser had improved its standards support so much, older pages no longer rendered properly. Or rather, they rendered properly (according to specifications), but people expected them to render improperly. The pages themselves had been authored based on the quirks of the dominant browsers of the day, primarily Netscape 4 and Internet Explorer 4. IE5/Mac was so advanced, it actually broke the web.

Microsoft came up with a novel solution. Before rendering a page, IE5/Mac looked at the “doctype,” which is typically the first line of the HTML source (even before the <html> element). Older pages (that relied on the rendering quirks of older browsers) generally didn’t have a doctype at all. IE5/Mac rendered these pages like older browsers did. In order to “activate” the new standards support, web page authors had to opt in, by supplying the right doctype before the <html> element.

This idea spread like wildfire, and soon all major browsers had two modes: “quirks mode” and “standards mode.” Of course, this being the web, things quickly got out of hand. When Mozilla tried to ship version 1.1 of their browser, they discovered that there were pages being rendered in “standards mode” that were actually relying on one specific quirk. Mozilla had just fixed its rendering engine to eliminate this quirk, and thousands of pages broke all at once. Thus was created (and I am not making this up) “almost standards mode.”

In his seminal work, Activating Browser Modes with Doctype, Henri Sivonen summarizes the different modes:

Quirks Mode
In the Quirks mode, browsers violate contemporary Web format specifications in order to avoid “breaking” pages authored according to practices that were prevalent in the late 1990s.

Standards Mode
In the Standards mode, browsers try to give conforming documents the specification-wise correct treatment to the extent implemented in a particular

66.
browser. HTML5 calls this mode the “no quirks mode.”

Almost Standards Mode
Firefox, Safari, Chrome, Opera (since 7.5) and IE8 also have a mode known as “Almost Standards mode,” that implements the vertical sizing of table cells traditionally and not rigorously according to the CSS2 specification. HTML5 calls this mode the “limited quirks mode.”

(You should read the rest of Henri’s article, because I’m simplifying immensely here. Even in IE5/Mac, there were a few older doctypes that didn’t count as far as opting into standards support. Over time, the list of quirks grew, and so did the list of doctypes that triggered “quirks mode.” The last time I tried to count, there were 5 doctypes that triggered “almost standards mode,” and 73 that triggered “quirks mode.” But I probably missed some, and I’m not even going to talk about the crazy shit that Internet Explorer 8 does to switch between its four (four!) different rendering modes. Here’s a flowchart. Kill it. Kill it with fire.)

Now then. Where were we? Ah yes, the doctype:

    <!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

That happens to be one of the 15 doctypes that trigger “standards mode” in all modern browsers. There is nothing wrong with it. If you like it, you can keep it. Or you can change it to the HTML5 doctype, which is shorter and sweeter and also triggers “standards mode” in all modern browsers.

This is the HTML5 doctype:

    <!DOCTYPE html>

That’s it. Just 15 characters. It’s so easy, you can type it by hand and not screw it up.

❧
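If you are curious which mode a given page ended up in, the DOM can tell you. The sketch below is only an illustration: the function names and the `doc` parameter are made up for this example (pass the real `document` in a browser), while `compatMode` and the `doctype` fields are standard DOM properties. Note that `compatMode` cannot tell standards mode and almost standards mode apart.

```javascript
// Classify the rendering mode the browser chose for a document.
// `doc` is passed in explicitly (an assumption of this sketch) so the
// function can be tried with a stand-in object outside a browser.
function renderingMode(doc) {
  // compatMode is "BackCompat" in quirks mode; standards mode and
  // almost standards mode both report "CSS1Compat".
  return doc.compatMode === 'BackCompat'
    ? 'quirks'
    : 'standards or almost standards';
}

// True when the document starts with the short HTML5 doctype,
// i.e. <!DOCTYPE html> with no public or system identifier.
function hasHtml5Doctype(doc) {
  var dt = doc.doctype; // null when the page has no doctype at all
  return !!dt && dt.name === 'html' && dt.publicId === '' && dt.systemId === '';
}
```

Calling `renderingMode(document)` on a page with no doctype should report quirks mode, and switching that page to `<!DOCTYPE html>` should flip the answer.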

67.
THE ROOT ELEMENT

An HTML page is a series of nested elements. The entire structure of the page is like a tree. Some elements are “siblings,” like two branches that extend from the same tree trunk. Some elements can be “children” of other elements, like a smaller branch that extends from a larger branch. (It works the other way too; an element that contains other elements is called the “parent” node of its immediate child elements, and the “ancestor” of its grandchildren.) Elements that have no children are called “leaf” nodes. The outer-most element, which is the ancestor of all other elements on the page, is called the “root element.” The root element of an HTML page is always <html>.

In this example page, the root element looks like this:

    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

There is nothing wrong with this markup. Again, if you like it, you can keep it. It is valid HTML5. But parts of it are no longer necessary in HTML5, so you can save a few bytes by removing them.

The first thing to discuss is the xmlns attribute. This is a vestige of XHTML 1.0. It says that elements in this page are in the XHTML namespace, http://www.w3.org/1999/xhtml. But elements in HTML5 are always in this namespace, so you no longer need to declare it explicitly. Your HTML5 page will work exactly the same in all browsers, whether this attribute is present or not.
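One way to see the namespace claim for yourself is to ask the DOM. This is a small sketch under stated assumptions: the function name and the `doc` parameter are invented for the example, and it presumes a browser with an HTML5-conforming parser (some older browsers report no namespace at all for pages served as text/html).

```javascript
var XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Report whether the root element sits in the XHTML namespace.
// Pass the real `document` in a browser; `doc` is a parameter only so
// the sketch can be tried with a stand-in object.
function rootIsInXhtmlNamespace(doc) {
  return doc.documentElement.namespaceURI === XHTML_NS;
}
```

In a browser with an HTML5 parser, `rootIsInXhtmlNamespace(document)` should give the same answer with or without the xmlns attribute in your markup.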

68.
Dropping the xmlns attribute leaves us with this root element:

    <html lang="en" xml:lang="en">

The two attributes here, lang and xml:lang, both define the language of this HTML page. (en stands for “English.” Not writing in English? Find your language code.) Why two attributes for the same thing? Again, this is a vestige of XHTML. Only the lang attribute has any effect in HTML5. You can keep the xml:lang attribute if you like, but if you do, you need to ensure that it contains the same value as the lang attribute.

To ease migration to and from XHTML, authors may specify an attribute in no namespace with no prefix and with the literal localname "xml:lang" on HTML elements in HTML documents, but such attributes must only be specified if a lang attribute in no namespace is also specified, and both attributes must have the same value when compared in an ASCII case-insensitive manner. The attribute in no namespace with no prefix and with the literal localname "xml:lang" has no effect on language processing.

Are you ready to drop it? It’s OK, just let it go. Going, going… gone! That leaves us with this root element:

    <html lang="en">

And that’s all I have to say about that.

❧

THE <HEAD> ELEMENT

69.
e ﬁrst ild of the root element is usually the <head> element. e <head> elementcontains metadata — information about the page, rather than the body of the page itself. (ebody of the page is, unsurprisingly, contained in the <body> element.) e <head> elementitself is rather boring, and it hasn’t anged in any interesting way in HTML5. e good stuﬀis what’s inside the <head> element. And for that, we turn once again to our example page: <head> <meta http-equiv="Content-Type" content="text/html; charset=utf- 8" /> <title>My Weblog</title> <link rel="stylesheet" type="text/css" href="style-original.css" /> <link rel="alternate" type="application/atom+xml" title="My Weblog feed" href="/feed/" /> <link rel="search" type="application/opensearchdescription+xml" title="My Weblog search" href="opensearch.xml" /> <link rel="shortcut icon" href="/favicon.ico" /> </head>First up: the <meta> element. ❧ CHARACTER ENCODINGdiveintohtml5.org WHAT DOES IT ALL MEAN?

70.
When you think of “text,” you probably think of “characters and symbols I see on my computer screen.” But computers don’t deal in characters and symbols; they deal in bits and bytes. Every piece of text you’ve ever seen on a computer screen is actually stored in a particular character encoding. There are hundreds of different character encodings, some optimized for particular languages like Russian or Chinese or English, and others that can be used for multiple languages. Roughly speaking, the character encoding provides a mapping between the stuff you see on your screen and the stuff your computer actually stores in memory and on disk.

In reality, it’s more complicated than that. The same character might appear in more than one encoding, but each encoding might use a different sequence of bytes to actually store the character in memory or on disk. So, you can think of the character encoding as a kind of decryption key for the text. Whenever someone gives you a sequence of bytes and claims it’s “text,” you need to know what character encoding they used so you can decode the bytes into characters and display them (or process them, or whatever).

So, how does your browser actually determine the character encoding of the stream of bytes that a web server sends? I’m glad you asked. If you’re familiar with HTTP headers, you may have seen a header like this:

    Content-Type: text/html; charset="utf-8"

Briefly, this says that the web server thinks it’s sending you an HTML document, and that it thinks the document uses the UTF-8 character encoding. Unfortunately, in the whole magnificent soup of the World Wide Web, few authors actually have control over their HTTP server. Think Blogger: the content is provided by individuals, but the servers are run by Google. So HTML 4 provided a way to specify the character encoding in the HTML document itself. You’ve probably seen this too:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Briefly, this says that the web author thinks they have authored an HTML document using the UTF-8 character encoding.

Both of these techniques still work in HTML5. The HTTP header is the preferred method, and

71.
it overrides the <meta> tag if present. But not everyone can set HTTP headers, so the <meta> tag is still around. In fact, it got a little easier in HTML5. Now it looks like this:

    <meta charset="utf-8" />

This works in all browsers. How did this shortened syntax come about? Here is the best explanation I could find:

The rationale for the <meta charset=""> attribute combination is that UAs already implement it, because people tend to leave things unquoted, like:

    <META HTTP-EQUIV=Content-Type CONTENT=text/html; charset=ISO-8859-1>

There are even a few <meta charset> test cases if you don’t believe that browsers already do this.

ASK PROFESSOR MARKUP ☞
Q: I never use funny characters. Do I still need to declare my character encoding?
A: Yes! You should always specify a character encoding on every HTML page you serve. Not specifying an encoding can lead to security vulnerabilities.

To sum up: character encoding is complicated, and it has not been made any easier by decades of poorly written software used by copy-and-paste–educated authors. You should always specify a character encoding on every HTML document, or bad things will happen. You can do it with the HTTP Content-Type header, the <meta http-equiv> declaration, or the shorter <meta charset> declaration, but please do it. The web thanks you.
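To see why the declared encoding matters, it can help to watch a single character turn into bytes and back. The following is a rough sketch, not anything a browser literally runs: `decodeAsLatin1` is a helper written for this example, and `TextEncoder` / `TextDecoder` are assumed to be available (they are in current browsers and in Node.js).

```javascript
// One character, "é" (U+00E9), stored as UTF-8: two bytes, 0xC3 0xA9.
var utf8Bytes = new TextEncoder().encode('\u00e9');

// ISO-8859-1 maps every byte straight to the code point with the same
// number, so decoding by hand is a one-liner. (Helper for this sketch only.)
function decodeAsLatin1(bytes) {
  return Array.from(bytes, function (b) {
    return String.fromCharCode(b);
  }).join('');
}

// Same bytes, two different "keys":
var rightGuess = new TextDecoder('utf-8').decode(utf8Bytes); // "é"
var wrongGuess = decodeAsLatin1(utf8Bytes);                  // "Ã©"
```

The bytes never changed; only the assumption about their encoding did. That is the garbage you see on pages where the declared (or guessed) encoding does not match the one the author actually used.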

72.
❧

FRIENDS & (LINK) RELATIONS

Regular links (<a href>) simply point to another page. Link relations are a way to explain why you’re pointing to another page. They finish the sentence “I’m pointing to this other page because…”

  …it’s a stylesheet containing CSS rules that your browser should apply to this document.
  …it’s a feed that contains the same content as this page, but in a standard subscribable format.
  …it’s a translation of this page into another language.
  …it’s the same content as this page, but in PDF format.
  …it’s the next chapter of an online book of which this page is also a part.

And so on. HTML5 breaks link relations into two categories:

Two categories of links can be created using the link element. Links to external resources are links to resources that are to be used to augment the current document, and hyperlink links are links to other documents. … The exact behavior for links to external resources depends on the exact relationship, as defined for the relevant link type.

Of the examples I just gave, only the first (rel="stylesheet") is a link to an external resource. The rest are hyperlinks to other documents. You may wish to follow those links, or you may not, but they’re not required in order to view the current page.

Most often, link relations are seen on <link> elements within the <head> of a page. Some link relations can also be used on <a> elements, but this is uncommon even when allowed. HTML5 also allows some relations on <area> elements, but this is even less common. (HTML 4 did not allow a rel attribute on <area> elements.) See the full chart of link

73.
relations to e where you can use speciﬁc rel values.ASK PROFESSOR MARKUP ☞ Q: Can I make up my own link relations? A: ere seems to be an inﬁnite supply of ideas for new link relations. In an aempt to prevent people from just making shit up , the WHATWG maintains a registry of proposed rel values and deﬁnes the process for geing them accepted. REL = STYLESHEETLet’s look at the ﬁrst link relation in our example page: <link rel="stylesheet" href="style-original.css" type="text/css" />is is the most frequently used link relation in the world (literally). <linkrel="stylesheet"> is for pointing to CSS rules that are stored in a separate ﬁle. Onesmall optimization you can make in HTML5 is to drop the type aribute. ere’s only onestylesheet language for the web, CSS, so that’s the default value for the type aribute. isworks in all browsers. (I suppose someone could invent a new stylesheet language someday,but if that happens, just add the type aribute ba.) <link rel="stylesheet" href="style-original.css" /> REL = ALTERNATEdiveintohtml5.org WHAT DOES IT ALL MEAN?

74.
Continuing with our example page:

    <link rel="alternate" type="application/atom+xml"
          title="My Weblog feed" href="/feed/" />

This link relation is also quite common. <link rel="alternate">, combined with either the RSS or Atom media type in the type attribute, enables something called “feed autodiscovery.” It allows syndicated feed readers (like Google Reader) to discover that a site has a news feed of the latest articles. Most browsers also support feed autodiscovery by displaying a special icon next to the URL. (Unlike with rel="stylesheet", the type attribute matters here. Don’t drop it!)

The rel="alternate" link relation has always been a strange hybrid of use cases, even in HTML 4. In HTML5, its definition has been clarified and extended to more accurately describe existing web content. As you just saw, using rel="alternate" in conjunction with type=application/atom+xml indicates an Atom feed for the current page. But you can also use rel="alternate" in conjunction with other type attributes to indicate the same content in another format, like PDF.

HTML5 also puts to rest a long-standing confusion about how to link to translations of documents. HTML 4 says to use the lang attribute in conjunction with rel="alternate" to specify the language of the linked document, but this is incorrect. The HTML 4 Errata document lists four outright errors in the HTML 4 specification. One of these outright errors is how to specify the language of a document linked with rel="alternate". The correct way, described in the HTML 4 Errata and now in HTML5, is to use the hreflang attribute. Unfortunately, these errata were never re-integrated into the HTML 4 spec, because no one in the W3C HTML Working Group was working on HTML anymore.

OTHER LINK RELATIONS IN HTML5

rel="archives" “indicates that the referenced document describes a collection of records, documents, or other materials of historical interest. A blog’s index page could link to an index of the blog’s past posts with rel="archives".”

75.
rel="author" is used to link to information about the author of the page. This can be a mailto: address, though it doesn’t have to be. It could simply link to a contact form or “about the author” page.

rel="external" “indicates that the link is leading to a document that is not part of the site that the current document forms a part of.” I believe it was first popularized by WordPress, which uses it on links left by commenters.

HTML 4 defined rel="start", rel="prev", and rel="next" to define relations between pages that are part of a series (like chapters of a book, or even posts on a blog). The only one that was ever used correctly was rel="next". People used rel="previous" instead of rel="prev"; they used rel="begin" and rel="first" instead of rel="start"; they used rel="end" instead of rel="last". Oh, and (all by themselves) they made up rel="up" to point to a “parent” page.

HTML5 includes rel="first", which was the most common variation of the different ways to say “first page in a series.” (rel="start" is a non-conforming synonym, provided for backward compatibility.) It also includes rel="prev" and rel="next", just like HTML 4, and supports rel="previous" for backward compatibility, as well as rel="last" (the last in a series, mirroring rel="first") and rel="up".

The best way to think of rel="up" is to look at your breadcrumb navigation (or at least imagine it). Your home page is probably the first page in your breadcrumbs, and the current page is at the tail end. rel="up" points to the next-to-last page in the breadcrumbs.

rel="icon" is the second most popular link relation, after rel="stylesheet". It is usually found together with shortcut, like so:

    <link rel="shortcut icon" href="/favicon.ico">

76.
All major browsers support this usage to associate a small icon with the page. Usually it’s displayed in the browser’s location bar next to the URL, or in the browser tab, or both.

Also new in HTML5: the sizes attribute can be used in conjunction with the icon relationship to indicate the size of the referenced icon.

rel="license" was invented by the microformats community. It “indicates that the referenced document provides the copyright license terms under which the current document is provided.”

rel="nofollow" “indicates that the link is not endorsed by the original author or publisher of the page, or that the link to the referenced document was included primarily because of a commercial relationship between people affiliated with the two pages.” It was invented by Google and standardized within the microformats community. WordPress adds rel="nofollow" to links added by commenters. The thinking was that if “nofollow” links did not pass on PageRank, spammers would give up trying to post spam comments on weblogs. That didn’t happen, but rel="nofollow" persists.

rel="noreferrer" “indicates that no referrer information is to be leaked when following the link.” No shipping browser currently supports this, but support was recently added to WebKit nightlies, so it will eventually be showing up in Safari, Google Chrome, and other WebKit-based browsers. [rel="noreferrer" test case]

rel="pingback" specifies the address of a “pingback” server. As explained in the Pingback specification, “The pingback system is a way for a blog to be automatically notified when other Web sites link to it. … It enables reverse linking, a way of going back up a chain of links rather than merely drilling down.” Blogging systems, notably WordPress, implement the pingback mechanism to notify authors that you have linked to them when creating a new blog post.

rel="prefetch" “indicates that preemptively fetching and caching the specified resource is likely to be beneficial, as it is highly likely that the user will require this resource.” Search engines sometimes add <link rel="prefetch" href="URL of top

77.
search result"> to the search results page if they feel that the top result is wildly more popular than any other. For example: using Firefox, search Google for CNN, view the page source, and search for the keyword prefetch. Mozilla Firefox is the only current browser that supports rel="prefetch".

rel="search" “indicates that the referenced document provides an interface specifically for searching the document and its related resources.” Specifically, if you want rel="search" to do anything useful, it should point to an OpenSearch document that describes how a browser could construct a URL to search the current site for a given keyword. OpenSearch (and rel="search" links that point to OpenSearch description documents) has been supported in Microsoft Internet Explorer since version 7 and Mozilla Firefox since version 2.

rel="sidebar" “indicates that the referenced document, if retrieved, is intended to be shown in a secondary browsing context (if possible), instead of in the current browsing context.” What does that mean? In Opera and Mozilla Firefox, it means “when I click this link, prompt the user to create a bookmark that, when selected from the Bookmarks menu, opens the linked document in a browser sidebar.” (Opera actually calls it the “panel” instead of the “sidebar.”) Internet Explorer, Safari, and Chrome ignore rel="sidebar" and just treat it as a regular link. [rel="sidebar" test case]

rel="tag" “indicates that the tag that the referenced document represents applies to the current document.” Marking up “tags” (category keywords) with the rel attribute was invented by Technorati to help them categorize blog posts. Early blogs and tutorials thus referred to them as “Technorati tags.” (You read that right: a commercial company convinced the entire world to add metadata that made the company’s job easier. Nice work if you can get it!) The syntax was later standardized within the microformats community, where it was simply called rel="tag". Most blogging systems that allow associating categories, keywords, or tags with individual posts will mark them up with rel="tag" links. Browsers do not do anything special with them; they’re really designed for search engines to use as a signal of what the page is about.

❧
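If you want to poke at the link relations on a real page, keep in mind that a rel attribute holds a set of space-separated keywords (which is how rel="shortcut icon" carries two of them) and that the keywords are matched case-insensitively. Here is a minimal sketch along those lines. All three function names are invented for this example, and `doc` stands in for the browser’s `document`.

```javascript
// Split a rel attribute value into lowercase keywords.
// A missing attribute (null) yields an empty list.
function parseRel(relValue) {
  return (relValue || '')
    .toLowerCase()
    .split(/\s+/)
    .filter(function (token) { return token.length > 0; });
}

// Does this rel value include the given keyword?
function hasRel(relValue, keyword) {
  return parseRel(relValue).indexOf(keyword.toLowerCase()) !== -1;
}

// Collect every <link> element in `doc` that carries the keyword.
function findLinksByRel(doc, keyword) {
  var found = [];
  var links = doc.getElementsByTagName('link');
  for (var i = 0; i < links.length; i++) {
    if (hasRel(links[i].getAttribute('rel'), keyword)) {
      found.push(links[i]);
    }
  }
  return found;
}
```

On the example page from this chapter, `findLinksByRel(document, 'icon')` should return the one favicon link, even though its attribute reads rel="shortcut icon".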

78.
NEW SEMANTIC ELEMENTS IN HTML5

HTML5 is not just about making existing markup shorter (although it does a fair amount of that). It also defines new semantic elements.

<section>
The section element represents a generic document or application section. A section, in this context, is a thematic grouping of content, typically with a heading. Examples of sections would be chapters, the tabbed pages in a tabbed dialog box, or the numbered sections of a thesis. A Web site’s home page could be split into sections for an introduction, news items, contact information.

<nav>
The nav element represents a section of a page that links to other pages or to parts within the page: a section with navigation links. Not all groups of links on a page need to be in a nav element; only sections that consist of major navigation blocks are appropriate for the nav element. In particular, it is common for footers to have a short list of links to common pages of a site, such as the terms of service, the home page, and a copyright page. The footer element alone is sufficient for such cases, without a nav element.

<article>
The article element represents a component of a page that consists of a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable, e.g. in syndication. This could be a forum post, a magazine or newspaper article, a Web log entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.

<aside>
The aside element represents a section of a page that consists of content that is tangentially related to the content around the aside element, and which could be considered separate from that content. Such sections are often represented as sidebars in printed typography. The element can be used for typographical effects like pull quotes or sidebars, for advertising, for groups

79.
of nav elements, and for other content that is considered separate from the main content of the page.

<hgroup>
The hgroup element represents the heading of a section. The element is used to group a set of h1–h6 elements when the heading has multiple levels, such as subheadings, alternative titles, or taglines.

<header>
The header element represents a group of introductory or navigational aids. A header element is intended to usually contain the section’s heading (an h1–h6 element or an hgroup element), but this is not required. The header element can also be used to wrap a section’s table of contents, a search form, or any relevant logos.

<footer>
The footer element represents a footer for its nearest ancestor sectioning content or sectioning root element. A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like. Footers don’t necessarily have to appear at the end of a section, though they usually do. When the footer element contains entire sections, they represent appendices, indexes, long colophons, verbose license agreements, and other such content.

<time>
The time element represents either a time on a 24 hour clock, or a precise date in the proleptic Gregorian calendar, optionally with a time and a time-zone offset.

<mark>
The mark element represents a run of text in one document marked or highlighted for reference purposes.

I know you’re anxious to start using these new elements, otherwise you wouldn’t be reading this chapter. But first we need to take a little detour.

❧

A LONG DIGRESSION INTO HOW BROWSERS HANDLE UNKNOWN ELEMENTS

Every browser has a master list of HTML elements that it supports. For example, Mozilla Firefox’s list is stored in nsElementTable.cpp. Elements not in this list are treated as “unknown elements.” There are two fundamental problems with unknown elements:

1. How should the element be styled? By default, <p> has spacing on the top and bottom, <blockquote> is indented with a left margin, and <h1> is displayed in a larger font. But what default styles should be applied to unknown elements?

2. What should the element’s DOM look like? Mozilla’s nsElementTable.cpp includes information about what kinds of other elements each element can contain. If you include markup like <p><p>, the second paragraph element implicitly closes the first one, so the elements end up as siblings, not parent-and-child. But if you write <p><span>, the span does not close the paragraph, because Firefox knows that <p> is a block element that can contain the inline element <span>. So, the <span> ends up as a child of the <p> in the DOM.

Different browsers answer these questions in different ways. (Shocking, I know.) Of the major browsers, Microsoft Internet Explorer’s answer to both questions is the most problematic, but every browser needs a little bit of help here.

The first question should be relatively simple to answer: don’t give any special styling to unknown elements. Just let them inherit whatever CSS properties are in effect wherever they appear on the page, and let the page author specify all styling with CSS. And that works, mostly, but there’s one little gotcha you need to be aware of.

PROFESSOR MARKUP SAYS
All browsers render unknown elements inline, i.e. as if they had a display:inline CSS rule.

There are several new elements defined in HTML5 which are block-level elements. That is, they can contain other block-level elements, and HTML5-compliant browsers will style them as display:block by default. If you want to use these elements in older browsers, you will need to define the display style manually:

    article,aside,details,figcaption,figure,
    footer,header,hgroup,menu,nav,section {
        display:block;
    }

(This code is lifted from Rich Clark’s HTML5 Reset Stylesheet, which does many other things that are beyond the scope of this chapter.)

But wait, it gets worse! Prior to version 9, Internet Explorer did not apply any styling on unknown elements. For example, if you had this markup:

    <style type="text/css">
      article { display: block; border: 1px solid red }
    </style>
    ...
    <article>
    <h1>Welcome to Initech</h1>
    <p>This is your <span>first day</span>.</p>
    </article>

Internet Explorer (up to and including IE 8) will not treat the <article> element as a block-level element, nor will it put a red border around the article. All the style rules are simply ignored. As I write this, Internet Explorer 9 is still in beta, but Microsoft has stated (and
developers have verified) that Internet Explorer 9 will not have this problem.

The second problem is the DOM that browsers create when they encounter unknown elements. Again, the most problematic browser is Internet Explorer. If IE doesn’t explicitly recognize the element name, it will insert the element into the DOM as an empty node with no children. All the elements that you would expect to be direct children of the unknown element will actually be inserted as siblings instead.

Here is some righteous ASCII art to illustrate the difference. This is the DOM that HTML5 dictates:

    article
    |
    +--h1 (child of article)
    |  |
    |  +--text node "Welcome to Initech"
    |
    +--p (child of article, sibling of h1)
       |
       +--text node "This is your "
       |
       +--span
       |  |
       |  +--text node "first day"
       |
       +--text node "."

But this is the DOM that Internet Explorer actually creates:

    article (no children)
    h1 (sibling of article)
    |
    +--text node "Welcome to Initech"
    p (sibling of h1)
    |
    +--text node "This is your "
    |
    +--span
    |  |
    |  +--text node "first day"
    |
    +--text node "."

There is a wondrous workaround for this problem. If you create a dummy <article> element with JavaScript before you use it in your page, Internet Explorer will magically recognize the <article> element and let you style it with CSS. There is no need to ever insert the dummy element into the DOM. Simply creating the element once (per page) is enough to teach IE to style the element it doesn’t recognize.

    <html>
    <head>
    <style>
      article { display: block; border: 1px solid red }
    </style>
    <script>document.createElement("article");</script>
    </head>
    <body>
    <article>
    <h1>Welcome to Initech</h1>
    <p>This is your <span>first day</span>.</p>
    </article>
    </body>
    </html>

This works in all versions of Internet Explorer, all the way back to IE 6! We can extend this technique to create dummy copies of all the new HTML5 elements at once (again, they’re never inserted into the DOM, so you’ll never see these dummy elements) and then just start using them without having to worry too much about non-HTML5-capable browsers.

Remy Sharp has done just that, with his aptly named HTML5 enabling script. The script has gone through 14 revisions at the time of writing, but this is the basic idea:

    <!--[if lt IE 9]>
    <script>
      var e = ("abbr,article,aside,audio,canvas,datalist,details," +
        "figure,footer,header,hgroup,mark,menu,meter,nav,output," +
        "progress,section,time,video").split(',');
      for (var i = 0; i < e.length; i++) {
        document.createElement(e[i]);
      }
    </script>
    <![endif]-->

The <!--[if lt IE 9]> and <![endif]--> bits are conditional comments. Internet Explorer interprets them like an if statement: “if the current browser is a version of Internet Explorer less than version 9, then execute this block.” Every other browser will treat the entire block as an HTML comment. The net result is that Internet Explorer (up to and including version 8) will execute this script, but other browsers will ignore the script altogether. This makes your page load faster in browsers that don’t need this hack.

The JavaScript code itself is relatively straightforward. The variable e ends up as an array of strings like "abbr", "article", "aside", and so on. Then we loop through this array and create each of the named elements by calling document.createElement(). Since we ignore the return value, the elements are never inserted into the DOM. Even so, this is enough to get Internet Explorer to treat these elements the way we want them to be treated, once we actually use them later in the page.

That “later” bit is important. This script needs to be at the top of your page, preferably in your <head> element, not at the bottom. That way, Internet Explorer will execute the script before it parses your tags and attributes. If you put this script at the bottom of your page, it will be too late. Internet Explorer will have already misinterpreted your markup and constructed the wrong DOM, and it won’t go back and adjust it just because of this script.

Remy Sharp has “minified” this script and hosted it on Google Project Hosting. (In case you were wondering, the script itself is open source and MIT-licensed, so you can use it in any project.)
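To make the loop described above easier to poke at outside Internet Explorer, here is a minimal sketch of the same idea. It is not Remy Sharp's script: the helper name registerElements and the stand-in fakeDocument object are assumptions made up for this illustration, and the only thing borrowed from the real DOM is the createElement(name) call.

```javascript
// Hypothetical helper: calls createElement() once for each name in a
// comma-separated list, on any object that offers that method.
function registerElements(doc, names) {
  var list = names.split(",");
  for (var i = 0; i < list.length; i++) {
    // The return value is deliberately discarded, so nothing is
    // ever inserted into the DOM.
    doc.createElement(list[i]);
  }
  return list;
}

// A stand-in "document" (an assumption for this sketch) that only
// records which names were created, so the code runs without a browser.
var created = [];
var fakeDocument = {
  createElement: function (name) {
    created.push(name);
    return { tagName: name.toUpperCase() };
  }
};

var registered = registerElements(fakeDocument, "article,aside,nav,section");
console.log(created.join(" "));
```

In a real page you would pass the browser's own document object instead of the stand-in, and you would still want the conditional comment around the script so that only Internet Explorer 8 and earlier run it.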
If you like, you can even “hotlink” the script by pointing directly to the hosted version, like this:

    <h2>I'm going to Prague!</h2>
    </div>

There is nothing wrong with this markup. If you like it, you can keep it. It is valid HTML5. But HTML5 provides some additional semantic elements for headers and sections.

First off, let’s get rid of that <div id="header">. This is a common pattern, but it doesn’t mean anything. The div element has no defined semantics, and the id attribute has no defined semantics. (User agents are not allowed to infer any meaning from the value of the id attribute.) You could change this to <div id="shazbot"> and it would have the same semantic value, i.e., nothing.

HTML5 defines a <header> element for this purpose. The HTML5 specification has real-world examples of using the <header> element. Here is what it would look like on our example page:

    <header>
    <h1>My Weblog</h1>
    <p class="tagline">A lot of effort went into making this effortless.</p>
    …
    </header>

That’s good. It tells anyone who wants to know that this is a header. But what about that tagline? Another common pattern, which up until now had no standard markup. It’s a difficult thing to mark up. A tagline is like a subheading, but it’s “attached” to the primary heading. That is, it’s a subheading that doesn’t create its own section.

Heading elements like <h1> and <h2> give your page structure. Taken together, they create an outline that you can use to visualize (or navigate) your page. Screenreaders use document outlines to help blind users navigate through your page. There are online tools and browser extensions that can help you visualize your document’s outline.

In HTML 4, <h1>–<h6> elements were the only way to create a document outline. The outline on the example page looks like this:

    My Weblog (h1)
    |
    +--Travel day (h2)
    |
    +--I'm going to Prague! (h2)

That’s fine, but it means that there’s no way to mark up the tagline “A lot of effort went into making this effortless.” If we tried to mark it up as an <h2>, it would add a phantom node to the document outline:

    My Weblog (h1)
    |
    +--A lot of effort went into making this effortless. (h2)
    |
    +--Travel day (h2)
    |
    +--I'm going to Prague! (h2)

But that’s not the structure of the document. The tagline does not represent a section; it’s just a subheading.

Perhaps we could mark up the tagline as an <h2> and mark up each article title as an <h3>? No, that’s even worse:

    My Weblog (h1)
    |
    +--A lot of effort went into making this effortless. (h2)
       |
       +--Travel day (h3)
       |
       +--I'm going to Prague! (h3)

Now we still have a phantom node in our document outline, but it has “stolen” the children that rightfully belong to the root node. And herein lies the problem: HTML 4 does not provide a way to mark up a subheading without adding it to the document outline. No matter how we try to shift things around, “A lot of effort went into making this effortless” is going to
end up in that graph. And that’s why we ended up with semantically meaningless markup like <p class="tagline">.

HTML5 provides a solution for this: the <hgroup> element. The <hgroup> element acts as a wrapper for two or more related heading elements. What does “related” mean? It means that, taken together, they only create a single node in the document outline.

Given this markup:

    <header>
      <hgroup>
        <h1>My Weblog</h1>
        <h2>A lot of effort went into making this effortless.</h2>
      </hgroup>
      …
    </header>
    …
    <div class="entry">
      <h2>Travel day</h2>
    </div>
    …
    <div class="entry">
      <h2>I'm going to Prague!</h2>
    </div>

This is the document outline that is created:

    My Weblog (h1 of its hgroup)
    |
    +--Travel day (h2)
    |
    +--I'm going to Prague! (h2)
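If it helps to see the phantom node appear and disappear, here is a rough sketch of a heading-based outline builder. It is a deliberately simplified model, not the outline algorithm from the HTML5 specification: it only looks at heading levels, and the input shapes ({ level, text } for a heading, { hgroup: [...] } for a group) and the function names are assumptions invented for this example.

```javascript
// Picks the highest-ranked heading (lowest level number) in an hgroup.
function highestRanked(headings) {
  return headings.reduce(function (best, h) {
    return h.level < best.level ? h : best;
  });
}

// Builds a nested outline from a flat list of headings and hgroups.
// An hgroup contributes exactly one node: its highest-ranked heading.
function buildOutline(entries) {
  var root = { level: 0, text: "(document)", children: [] };
  var stack = [root];
  entries.forEach(function (entry) {
    var heading = entry.hgroup ? highestRanked(entry.hgroup) : entry;
    var node = { level: heading.level, text: heading.text, children: [] };
    // Climb back up until the top of the stack outranks this heading.
    while (stack[stack.length - 1].level >= node.level) {
      stack.pop();
    }
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  });
  return root.children;
}

var tagline = { level: 2, text: "A lot of effort went into making this effortless." };
var title = { level: 1, text: "My Weblog" };
var posts = [
  { level: 2, text: "Travel day" },
  { level: 2, text: "I'm going to Prague!" }
];

// Tagline as a bare <h2>: it shows up as a phantom node.
var plainOutline = buildOutline([title, tagline].concat(posts));
// Tagline wrapped with the title in an <hgroup>: one node for both.
var groupedOutline = buildOutline([{ hgroup: [title, tagline] }].concat(posts));

console.log(plainOutline[0].children.length);   // 3
console.log(groupedOutline[0].children.length); // 2
```

The first outline has three children under “My Weblog” (the tagline plus the two posts), while the grouped version has only the two posts, which matches the trees drawn above.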

You can test your own pages in the HTML5 Outliner to ensure that you’re using the heading elements properly.

❧

ARTICLES

Continuing with our example page, let’s see what we can do about this markup:

    <div class="entry">
      <p class="post-date">October 22, 2009</p>
      <h2>
        <a href="#"
           rel="bookmark"
           title="link to this post">
          Travel day
        </a>
      </h2>
      …
    </div>

Again, this is valid HTML5. But HTML5 provides a more specific element for the common case of marking up an article on a page: the aptly named <article> element.

    <article>
      <p class="post-date">October 22, 2009</p>
      <h2>
        <a href="#"
           rel="bookmark"
           title="link to this post">
          Travel day
        </a>
      </h2>
      …
    </article>

Ah, but it’s not quite that simple. There is one more change you should make. I’ll show it to you first, then explain it:

    <article>
      <header>
        <p class="post-date">October 22, 2009</p>
        <h1>
          <a href="#"
             rel="bookmark"
             title="link to this post">
            Travel day
          </a>
        </h1>
      </header>
      …
    </article>

Did you catch that? I changed the <h2> element to an <h1>, and wrapped it inside a <header> element. You’ve already seen the <header> element in action. Its purpose is to wrap all the elements that form the article’s header (in this case, the article’s publication date and title). But…but…but… shouldn’t you only have one <h1> per document? Won’t this screw up the document outline? No, but to understand why not, we need to back up a step.

In HTML 4, the only way to create a document outline was with the <h1>–<h6> elements. If you only wanted one root node in your outline, you had to limit yourself to one <h1> in your markup. But the HTML5 specification defines an algorithm for generating a document outline that incorporates the new semantic elements in HTML5. The HTML5 algorithm says that an <article> element creates a new section, that is, a new node in the document outline. And in HTML5, each section can have its own <h1> element.

This is a drastic change from HTML 4, and here’s why it’s a good thing. Many web pages are really generated by templates. A bit of content is taken from one source and inserted into the page up here; a bit of content is taken from another source and inserted into the page down
there. Many tutorials are structured the same way. “Here’s some HTML markup. Just copy it and paste it into your page.” That’s fine for small bits of content, but what if the markup you’re pasting is an entire section? In that case, the tutorial will read something like this: “Here’s some HTML markup. Just copy it, paste it into a text editor, and fix the heading tags so they match the nesting level of the corresponding heading tags in the page you’re pasting it into.”

Let me put it another way. HTML 4 has no generic heading element. It has six strictly numbered heading elements, <h1>–<h6>, which must be nested in exactly that order. That kind of sucks, especially if your page is “assembled” instead of “authored.” And this is the problem that HTML5 solves with the new sectioning elements and the new rules for the existing heading elements. If you’re using the new sectioning elements, I can give you this markup:

    <article>
      <header>
        <h1>A syndicated post</h1>
      </header>
      <p>Lorem ipsum blah blah…</p>
    </article>

and you can copy it and paste it anywhere in your page without modification. The fact that it contains an <h1> element is not a problem, because the entire thing is contained within an <article>. The <article> element defines a self-contained node in the document outline, the <h1> element provides the title for that outline node, and all the other sectioning elements on the page will remain at whatever nesting level they were at before.

PROFESSOR MARKUP SAYS
As with all things on the web, reality is a little more complicated than I’m letting on. The new “explicit” sectioning elements (like <h1> wrapped in <article>) may interact in unexpected ways with the old “implicit” sectioning elements (<h1>–<h6> by themselves). Your life will be simpler if you use one or the other, but not both. If you must use both on the same page, be sure to check the
result in the HTML5 Outliner and verify that your document outline makes sense.

❧

DATES AND TIMES

This is exciting, right? I mean, it’s not “skiing down Mount Everest naked while reciting the Star Spangled Banner backwards” exciting, but it’s pretty exciting as far as semantic markup goes. Let’s continue with our example page. The next line I want to highlight is this one:

    <div class="entry">
      <p class="post-date">October 22, 2009</p>
      <h2>Travel day</h2>
    </div>

Same old story, right? A common pattern (designating the publication date of an article) that has no semantic markup to back it up, so authors resort to generic markup with custom class attributes. Again, this is valid HTML5. You’re not required to change it. But HTML5 does provide a specific solution for this case: the <time> element.

    <time datetime="2009-10-22" pubdate>October 22, 2009</time>

There are three parts to a <time> element:

1. A machine-readable timestamp
2. Human-readable text content
3. An optional pubdate flag

In this example, the datetime attribute only specifies a date, not a time. The format is a four-digit year, two-digit month, and two-digit day, separated by dashes:

    <time datetime="2009-10-22" pubdate>October 22, 2009</time>

If you want to include a time too, add the letter T after the date, then the time in 24-hour format, then a timezone offset.

    <time datetime="2009-10-22T13:59:47-04:00" pubdate>
      October 22, 2009 1:59pm EDT
    </time>

(The date/time format is pretty flexible. The HTML5 specification contains examples of valid date/time strings.)

Notice I changed the text content (the stuff between <time> and </time>) to match the machine-readable timestamp. This is not actually required. The text content can be anything you like, as long as you provide a machine-readable date/timestamp in the datetime attribute. So this is valid HTML5:

    <time datetime="2009-10-22">last Thursday</time>

And this is also valid HTML5:

    <time datetime="2009-10-22"></time>

The final piece of the puzzle here is the pubdate attribute. It’s a Boolean attribute, so just add it if you need it, like this:

    <time datetime="2009-10-22" pubdate>October 22, 2009</time>

If you dislike “naked” attributes, this is also equivalent:

    <time datetime="2009-10-22" pubdate="pubdate">October 22, 2009</time>
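If you generate your pages from a template, you will probably build the datetime value in code. Here is a small sketch of the two formats described above (date only, and date plus time plus timezone offset). The function name toDatetimeAttr and the shape of its argument are assumptions for this example, not anything defined by HTML5, and it assumes a four-digit year.

```javascript
// Left-pads a number to two digits.
function pad(n) {
  return (n < 10 ? "0" : "") + n;
}

// parts: { year, month, day } for a date only, plus optional
// { hour, minute, second, offsetMinutes } for a full timestamp.
// offsetMinutes is minutes east of UTC (EDT would be -240).
function toDatetimeAttr(parts) {
  var value = parts.year + "-" + pad(parts.month) + "-" + pad(parts.day);
  if (parts.hour === undefined) {
    return value; // date only
  }
  value += "T" + pad(parts.hour) + ":" + pad(parts.minute) + ":" + pad(parts.second);
  var sign = parts.offsetMinutes < 0 ? "-" : "+";
  var abs = Math.abs(parts.offsetMinutes);
  return value + sign + pad(Math.floor(abs / 60)) + ":" + pad(abs % 60);
}

var dateOnly = toDatetimeAttr({ year: 2009, month: 10, day: 22 });
var withTime = toDatetimeAttr({
  year: 2009, month: 10, day: 22,
  hour: 13, minute: 59, second: 47,
  offsetMinutes: -240
});

console.log(dateOnly); // 2009-10-22
console.log(withTime); // 2009-10-22T13:59:47-04:00
```

The human-readable text between <time> and </time> is a separate concern; as noted above, it can say anything you like.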

What does the pubdate attribute mean? It means one of two things. If the <time> element is in an <article> element, it means that this timestamp is the publication date of the article. If the <time> element is not in an <article> element, it means that this timestamp is the publication date of the entire document.

Here’s the entire article, reformulated to take full advantage of HTML5:

    <article>
      <header>
        <time datetime="2009-10-22" pubdate>
          October 22, 2009
        </time>
        <h1>
          <a href="#"
             rel="bookmark"
             title="link to this post">
            Travel day
          </a>
        </h1>
      </header>
      <p>Lorem ipsum dolor sit amet…</p>
    </article>

❧

NAVIGATION

One of the most important parts of any web site is the navigation bar. CNN.com has “tabs” along the top of each page that link to the different news sections: “Tech,” “Health,” “Sports,” &c. Google search results pages have a similar strip at the top of the
page to try your search in different Google services: “Images,” “Video,” “Maps,” &c. And our example page has a navigation bar in the header that includes links to different sections of our hypothetical site: “home,” “blog,” “gallery,” and “about.”

This is how the navigation bar was originally marked up:

    <div id="nav">
      <ul>
        <li><a href="#">home</a></li>
        <li><a href="#">blog</a></li>
        <li><a href="#">gallery</a></li>
        <li><a href="#">about</a></li>
      </ul>
    </div>

Again, this is valid HTML5. But while it’s marked up as a list of four items, there is nothing about the list that tells you that it’s part of the site navigation. Visually, you could guess that by the fact that it’s part of the page header, and by reading the text of the links. But semantically, there is nothing to distinguish this list of links from any other.

Who cares about the semantics of site navigation? For one, people with disabilities. Why is that? Consider this scenario: your motion is limited, and using a mouse is difficult or impossible. To compensate, you might use a browser add-on that allows you to jump to (or jump past) major navigation links. Or consider this: if your sight is limited, you might use a dedicated program called a “screenreader” that uses text-to-speech to speak and summarize web pages. Once you get past the page title, the next important pieces of information about a page are the major navigation links. If you want to navigate quickly, you’ll tell your screenreader to jump to the navigation bar and start reading. If you want to browse quickly, you might tell your screenreader to jump over the navigation bar and start reading the main content. Either way, being able to determine navigation links programmatically is important.

So, while there’s nothing wrong with using <div id="nav"> to mark up your site
navigation, there’s nothing particularly right about it either. It’s suboptimal in ways that affect real people. HTML5 provides a semantic way to mark up navigation sections: the <nav> element.

    <nav>
      <ul>
        <li><a href="#">home</a></li>
        <li><a href="#">blog</a></li>
        <li><a href="#">gallery</a></li>
        <li><a href="#">about</a></li>
      </ul>
    </nav>

ASK PROFESSOR MARKUP
☞ Q: Are skip links compatible with the <nav> element? Do I still need skip links in HTML5?
A: Skip links allow readers to skip over navigation sections. They are helpful for disabled users who use third-party software to read a web page aloud and navigate it without a mouse. (Learn how and why to provide skip links.) Once screenreaders are updated to recognize the <nav> element, skip links will become obsolete, since the screenreader software will be able to automatically offer to skip over a navigation section marked up with the <nav> element. However, it will be a while before all the disabled users on the web upgrade to HTML5-savvy screenreader software, so you should continue to provide your own skip links to jump over <nav> sections.

❧

FOOTERS

At long last, we have arrived at the end of our example page. The last thing I want to talk about is the last thing on the page: the footer. The footer was originally marked up like this:

    <div id="footer">
      <p>&#167;</p>
      <p>&#169; 2001&#8211;9 <a href="#">Mark Pilgrim</a></p>
    </div>

This is valid HTML5. If you like it, you can keep it. But HTML5 provides a more specific element for this: the <footer> element.

    <footer>
      <p>&#167;</p>
      <p>&#169; 2001&#8211;9 <a href="#">Mark Pilgrim</a></p>
    </footer>

What’s appropriate to put in a <footer> element? Probably whatever you’re putting in a <div id="footer"> now. OK, that’s a circular answer. But really, that’s it. The HTML5 specification says, “A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like.” That’s what’s in this example page: a short copyright statement and a link to an about-the-author page. Looking around at some popular sites, I see lots of footer potential.

- CNN has a footer that contains a copyright statement, links to translations, and links to terms of service, privacy, “about us,” “contact us,” and “help” pages. All totally
  appropriate <footer> material.
- Google has a famously sparse home page, but at the bottom of it are links to “Advertising Programs,” “Business Solutions,” and “About Google”; a copyright statement; and a link to Google’s privacy policy. All of that could be wrapped in a <footer>.
- My weblog has a footer with links to my other sites, plus a copyright statement. Definitely appropriate for a <footer> element. (Note that the links themselves should not be wrapped in a <nav> element, because they are not site navigation links; they are just a collection of links to my other projects on other sites.)

“Fat footers” are all the rage these days. Take a look at the footer on the W3C site. It contains three columns, labeled “Navigation,” “Contact W3C,” and “W3C Updates.” The markup looks like this, more or less:

    <div id="w3c_footer">
      <div class="w3c_footer-nav">
        <h3>Navigation</h3>
        <ul>
          <li><a href="/">Home</a></li>
          <li><a href="/standards/">Standards</a></li>
          <li><a href="/participate/">Participate</a></li>
          <li><a href="/Consortium/membership">Membership</a></li>
          <li><a href="/Consortium/">About W3C</a></li>
        </ul>
      </div>
      <div class="w3c_footer-nav">
        <h3>Contact W3C</h3>
        <ul>
          <li><a href="/Consortium/contact">Contact</a></li>
          <li><a href="/Help/">Help and FAQ</a></li>
          <li><a href="/Consortium/sup">Donate</a></li>
          <li><a href="/Consortium/siteindex">Site Map</a></li>
        </ul>
      </div>
      <div class="w3c_footer-nav">
        <h3>W3C Updates</h3>
        <ul>

On standards modes and doctype sniffing:

- Activating Browser Modes with Doctype by Henri Sivonen. This is the only article you should read on the subject. Any article on doctypes that doesn’t reference Henri’s work is guaranteed to be out of date, incomplete, or wrong.

HTML5-aware validator:

- html5.validator.nu

❧

This has been “What Does It All Mean?” The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?
In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

Copyright MMIX–MMX Mark Pilgrim


№4. LET’S CALL IT A DRAW(ING SURFACE)

❧

DIVING IN

HTML 5 defines the <canvas> element as “a resolution-dependent bitmap canvas which can be used for rendering graphs, game graphics, or other visual images on the fly.” A canvas is a rectangle in your page where you can use JavaScript to draw anything you want.

BASIC <CANVAS> SUPPORT
IE*    FIREFOX  SAFARI  CHROME  OPERA  IPHONE  ANDROID
7.0+   3.0+     3.0+    3.0+    10.0+  1.0+    1.0+
* Internet Explorer support requires the third-party explorercanvas library.

So what does a canvas look like? Nothing, really. A <canvas> element has no content and no border of its own.

[Figure: Invisible canvas]

The markup looks like this:

    <canvas width="300" height="225"></canvas>

Let’s add a dotted border so we can see what we’re dealing with.

[Figure: Canvas with border]

You can have more than one <canvas> element on the same page. Each canvas will show up in the DOM, and each canvas maintains its own state. If you give each canvas an id attribute, you can access them just like any other element.

Let’s expand that markup to include an id attribute:

    <canvas id="a" width="300" height="225"></canvas>

Now you can easily find that <canvas> element in the DOM.

    var a_canvas = document.getElementById("a");

❧

SIMPLE SHAPES

Every canvas has a drawing context, which is where all the fun stuff happens. Once you’ve found a <canvas> element in the DOM (by using document.getElementById() or any other method you like), you call its getContext() method. You must pass the string "2d" to the getContext() method.

☞ Q: Is there a 3-D canvas?
A: Not yet. Individual vendors have experimented with their own three-dimensional canvas APIs, but none of them have been standardized. The HTML5 specification notes, “A future version of this specification will probably define a 3d context.”

So, you have a <canvas> element, and you have its drawing context. The drawing context is where all the drawing methods and properties are defined. There’s a whole group of properties and methods devoted to drawing rectangles:

- The fillStyle property can be a CSS color, a pattern, or a gradient. (More on gradients shortly.) The default fillStyle is solid black, but you can set it to whatever you like. Each drawing context remembers its own properties as long as the page is open, unless you do something to reset it.
- fillRect(x, y, width, height) draws a rectangle filled with the current fill style.
- The strokeStyle property is like fillStyle: it can be a CSS color, a pattern, or a gradient.
- strokeRect(x, y, width, height) draws a rectangle with the current stroke style. strokeRect doesn’t fill in the middle; it just draws the edges.
- clearRect(x, y, width, height) clears the pixels in the specified rectangle.

ASK PROFESSOR MARKUP

☞ Q: Can I “reset” a canvas?
A: Yes. Setting the width or height of a <canvas> element will erase its contents and reset all the properties of its drawing context to their default values. You don’t even need to change the width; you can simply set it to its current value, like this:

    var b_canvas = document.getElementById("b");
    b_canvas.width = b_canvas.width;

Getting back to that code sample in the previous example…

Draw a rectangle:

    var b_canvas = document.getElementById("b");
    var b_context = b_canvas.getContext("2d");
    b_context.fillRect(50, 25, 150, 100);

Calling the fillRect() method draws the rectangle and fills it with the current fill style, which is black until you change it. The rectangle is bounded by its upper-left corner (50, 25), its width (150), and its height (100). To get a better picture of how that works, let’s look at the canvas coordinate system.

❧

CANVAS COORDINATES

The canvas is a two-dimensional grid. The coordinate (0, 0) is at the upper-left corner of the
canvas. Along the X-axis, values increase towards the right edge of the canvas. Along the Y-axis, values increase towards the bottom edge of the canvas.

[Figure: Canvas coordinates diagram]

That coordinate diagram was drawn with a <canvas> element. It comprises

- a set of off-white vertical lines
- a set of off-white horizontal lines
- two black horizontal lines
- two small black diagonal lines that form an arrow
- two black vertical lines
- two small black diagonal lines that form another arrow
- the letter “x”
- the letter “y”
- the text “(0, 0)” near the upper-left corner
- the text “(500, 375)” near the lower-right corner
- a dot in the upper-left corner, and another in the lower-right corner

First, we need to define the <canvas> element itself. The <canvas> element defines the width and height, and the id so we can find it later.

    <canvas id="c" width="500" height="375"></canvas>

Then we need a script to find the <canvas> element in the DOM and get its drawing context.

    var c_canvas = document.getElementById("c");
    var context = c_canvas.getContext("2d");

Now we can start drawing lines.

❧

PATHS

IE*    FIREFOX  SAFARI  CHROME  OPERA  IPHONE  ANDROID
7.0+   3.0+     3.0+    3.0+    10.0+  1.0+    1.0+
* Internet Explorer support requires the third-party explorercanvas library.

Imagine you’re drawing a picture in ink. You don’t want to just dive in and start drawing with ink, because you might make a mistake. Instead, you sketch the lines and curves with a pencil, and once you’re happy with it, you trace over your sketch in ink.

Each canvas has a path. Defining the path is like drawing with a pencil. You can draw whatever you like, but it won’t be part of the finished product until you pick up the quill and trace over your path in ink.

To draw straight lines in pencil, you use the following two methods:

1. moveTo(x, y) moves the pencil to the specified starting point.
2. lineTo(x, y) draws a line to the specified ending point.

The more you call moveTo() and lineTo(), the bigger the path gets. These are “pencil” methods: you can call them as often as you like, but you won’t see anything on the canvas
until you call one of the “ink” methods.

Let’s begin by drawing the off-white grid.

Draw vertical lines:

    for (var x = 0.5; x < 500; x += 10) {
      context.moveTo(x, 0);
      context.lineTo(x, 375);
    }

Draw horizontal lines:

    for (var y = 0.5; y < 375; y += 10) {
      context.moveTo(0, y);
      context.lineTo(500, y);
    }

Those were all “pencil” methods. Nothing has actually been drawn on the canvas yet. We need an “ink” method to make it permanent.

    context.strokeStyle = "#eee";
    context.stroke();

stroke() is one of the “ink” methods. It takes the complex path you defined with all those moveTo() and lineTo() calls, and actually draws it on the canvas. The strokeStyle controls the color of the lines. This is the result:

ASK PROFESSOR MARKUP
☞ Q: Why did you start x and y at 0.5? Why not 0?
A: Imagine each pixel as a large square. The whole-number coordinates (0, 1, 2…) are the edges of the squares. If you draw a one-unit-wide line between whole-number coordinates, it will overlap opposite sides of the pixel square, and the resulting line will be drawn two pixels wide. To draw a line that is only one pixel wide, you need to shift the coordinates by 0.5 perpendicular to the line's direction.

For example, if you try to draw a line from (1, 0) to (1, 3), the browser will draw a line covering 0.5 screen pixels on either side of x=1. The screen can’t display half a pixel, so it expands the line to cover a total of two pixels:

112.
But, if you try to draw a line from (1.5, 0) to (1.5, 3), the browser will draw a line covering 0.5 screen pixels on either side of x=1.5, which results in a true 1-pixel-wide line:

Thanks to Jason Johnson for providing these diagrams.
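The half-pixel arithmetic can be checked with a few lines of code. The helper below is hypothetical (it is not part of the canvas API); it simply lists which pixel columns a vertical line touches, treating pixel column n as the span of coordinates from n to n + 1.

```javascript
// Sketch: which pixel columns does a vertical line centered at x touch?
// coveredPixelColumns is a made-up helper for illustration only.
function coveredPixelColumns(x, lineWidth) {
  var left = x - lineWidth / 2;   // left edge of the stroke
  var right = x + lineWidth / 2;  // right edge of the stroke
  var columns = [];
  // Pixel column n spans the coordinates from n to n + 1
  for (var n = Math.floor(left); n < right; n++) {
    columns.push(n);
  }
  return columns;
}

coveredPixelColumns(1, 1);    // [0, 1]: straddles two pixel columns
coveredPixelColumns(1.5, 1);  // [1]: exactly one pixel column
```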

113.
Now let’s draw the horizontal arrow. All the lines and curves on a path are drawn in the same color (or gradient; yes, we’ll get to those soon). We want to draw the arrow in a different color ink, black instead of off-white, so we need to start a new path.

    // A new path
    context.beginPath();
    context.moveTo(0, 40);
    context.lineTo(240, 40);
    context.moveTo(260, 40);
    context.lineTo(500, 40);
    context.moveTo(495, 35);
    context.lineTo(500, 40);
    context.lineTo(495, 45);

The vertical arrow looks much the same. Since the vertical arrow is the same color as the horizontal arrow, we do not need to start another new path. The two arrows will be part of the same path.

    // Not a new path
    context.moveTo(60, 0);
    context.lineTo(60, 153);
    context.moveTo(60, 173);
    context.lineTo(60, 375);
    context.moveTo(65, 370);
    context.lineTo(60, 375);
    context.lineTo(55, 370);

I said these arrows were going to be black, but the strokeStyle is still off-white. (The fillStyle and strokeStyle don’t get reset when you start a new path.) That’s OK, because we’ve just run a series of “pencil” methods. But before we draw it for real, in “ink,” we need to set the strokeStyle to black. Otherwise, these two arrows will be off-white, and we’ll hardly be able to see them! The following lines change the color to black and draw the lines on the canvas:

    context.strokeStyle = "#000";
    context.stroke();
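The “one path per color” rule suggests a small helper. The following is a sketch under assumptions: strokeSegments and the [x0, y0, x1, y1] segment format are invented for this example and are not part of the canvas API.

```javascript
// Sketch: draw a list of straight segments as one path in one color.
// strokeSegments and the [x0, y0, x1, y1] format are assumptions.
function strokeSegments(context, segments, color) {
  context.beginPath();  // new path, so earlier lines are not stroked again
  for (var i = 0; i < segments.length; i++) {
    var s = segments[i];
    context.moveTo(s[0], s[1]);
    context.lineTo(s[2], s[3]);
  }
  context.strokeStyle = color;  // set the ink color just before inking
  context.stroke();
}
```

With a real drawing context, strokeSegments(context, [[0, 40, 240, 40], [260, 40, 500, 40]], "#000") would draw the two halves of the horizontal arrow’s shaft in black without touching the grid’s color.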

114.
This is the result:

❧

TEXT

    IE*    FIREFOX†  SAFARI  CHROME  OPERA  IPHONE  ANDROID
    7.0+   3.0+      3.0+    3.0+    10.0+  1.0+    1.0+

    * Internet Explorer support requires the third-party explorercanvas library.
    † Mozilla Firefox 3.0 support requires a compatibility shim.

In addition to drawing lines on a canvas, you can also draw text on a canvas. Unlike text on the surrounding web page, there is no box model. That means none of the familiar CSS layout techniques are available: no floats, no margins, no padding, no word wrapping. (Maybe you think that’s a good thing!) You can set a few font attributes, then you pick a point on the canvas and draw your text there.

The following font attributes are available on the drawing context:

  - font can be anything you would put in a CSS font rule. That includes font style, font

115.
    variant, font weight, font size, line height, and font family.
  - textAlign controls text alignment. It is similar (but not identical) to a CSS text-align rule. Possible values are start, end, left, right, and center.
  - textBaseline controls where the text is drawn relative to the starting point. Possible values are top, hanging, middle, alphabetic, ideographic, or bottom.

textBaseline is tricky, because text is tricky (English text isn’t, but you can draw any Unicode character you like on a canvas, and Unicode is tricky). The HTML5 specification explains the different text baselines:

    The top of the em square is roughly at the top of the glyphs in a font, the hanging baseline is where some glyphs like आ are anchored, the middle is half-way between the top of the em square and the bottom of the em square, the alphabetic baseline is where characters like Á, ÿ, f, and Ω are anchored, the ideographic baseline is where glyphs like 私 and 達 are anchored, and the bottom of the em square is roughly at the bottom of the glyphs in a font. The top and bottom of the bounding box can be far from these baselines, due to glyphs extending far outside the em square.

For simple alphabets like English, you can safely stick with top, middle, or bottom for the textBaseline property.

Let’s draw some text! Text drawn inside the canvas inherits the font size and style of the <canvas> element itself, but you can override this by setting the font property on the drawing context.

    context.font = "bold 12px sans-serif";  // Change the font style
    context.fillText("x", 248, 43);

116.
    context.fillText("y", 58, 165);

The fillText() method draws the actual text.

    context.font = "bold 12px sans-serif";
    // Draw the text
    context.fillText("x", 248, 43);
    context.fillText("y", 58, 165);

ASK PROFESSOR MARKUP

☞ Q: Can I use relative font sizes to draw text on a canvas?

A: Yes. Like every other HTML element on your page, the <canvas> element itself has a computed font size based on your page’s CSS rules. If you set the context.font property to a relative font size like 1.5em or 150%, your browser multiplies this by the computed font size of the <canvas> element itself.

For the text in the upper-left corner, let’s say I want the top of the text to be at y=5. But I’m lazy: I don’t want to measure the height of the text and calculate the baseline. Instead, I can set textBaseline to top and pass in the upper-left coordinate of the text’s bounding box.

    context.textBaseline = "top";
    context.fillText("( 0 , 0 )", 8, 5);

Now for the text in the lower-right corner. Let’s say I want the bottom-right corner of the text to be at coordinates (492,370), just a few pixels away from the bottom-right corner of the canvas, but I don’t want to measure the width or height of the text. I can set textAlign to right and textBaseline to bottom, then call fillText() with the bottom-right coordinates of the text’s bounding box.

117.
    context.textAlign = "right";
    context.textBaseline = "bottom";
    context.fillText("( 500 , 375 )", 492, 370);

And this is the result:

Oops! We forgot the dots in the corners. We’ll see how to draw circles a little later. For now, I’ll cheat a little and draw them as rectangles.

    // Draw two “dots”
    context.fillRect(0, 0, 3, 3);
    context.fillRect(497, 372, 3, 3);

And that’s all she wrote! Here is the final product:
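The corner-label trick generalizes to any canvas size. Here is a rough sketch; labelCorners and its single margin parameter are assumptions made for this example (the diagram above uses slightly different horizontal and vertical offsets).

```javascript
// Sketch: label the upper-left and lower-right corners without measuring text.
// labelCorners and margin are hypothetical names for illustration.
function labelCorners(context, width, height, margin) {
  context.font = "bold 12px sans-serif";

  // Upper-left label: (margin, margin) is the top-left of the text's box
  context.textAlign = "left";
  context.textBaseline = "top";
  context.fillText("( 0 , 0 )", margin, margin);

  // Lower-right label: the point given is the bottom-right of the text's box
  context.textAlign = "right";
  context.textBaseline = "bottom";
  context.fillText("( " + width + " , " + height + " )",
                   width - margin, height - margin);
}
```

The design choice is the same as in the text: let textAlign and textBaseline decide which corner of the bounding box the coordinates refer to, instead of measuring the rendered text.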

119.
The markup looks the same as any other canvas.

    <canvas id="d" width="300" height="225"></canvas>

First, we need to find the <canvas> element and its drawing context.

    var d_canvas = document.getElementById("d");
    var context = d_canvas.getContext("2d");

Once we have the drawing context, we can start to define a gradient. A gradient is a smooth transition between two or more colors. The canvas drawing context supports two types of gradients:

 1. createLinearGradient(x0, y0, x1, y1) paints along a line from (x0, y0) to (x1, y1).
 2. createRadialGradient(x0, y0, r0, x1, y1, r1) paints along a cone between two circles. The first three parameters represent the start circle, with origin (x0, y0) and radius r0. The last three parameters represent the end circle, with origin (x1, y1) and radius r1.

Let’s make a linear gradient. Gradients can be any size, but I’ll make this gradient be 300 pixels wide, like the canvas.

    // Create a gradient object
    var my_gradient = context.createLinearGradient(0, 0, 300, 0);

120.
Because the y values (the 2nd and 4th parameters) are both 0, this gradient will shade evenly from left to right.

Once we have a gradient object, we can define the gradient’s colors. A gradient has two or more color stops. Color stops can be anywhere along the gradient. To add a color stop, you need to specify its position along the gradient. Gradient positions can be anywhere between 0 and 1.

Let’s define a gradient that shades from black to white.

    my_gradient.addColorStop(0, "black");
    my_gradient.addColorStop(1, "white");

Defining a gradient doesn’t draw anything on the canvas. It’s just an object tucked away in memory somewhere. To draw a gradient, you set your fillStyle to the gradient and draw a shape, like a rectangle or a line.

    // Fill style is a gradient
    context.fillStyle = my_gradient;
    context.fillRect(0, 0, 300, 225);

And this is the result:

Suppose you want a gradient that shades from top to bottom. When you create the gradient object, keep the x values (1st and 3rd parameters) constant, and make the y values (2nd and 4th parameters) range from 0 to the height of the canvas.
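A top-to-bottom gradient along those lines might look like the sketch below. The function name fillVerticalGradient and its parameters are made up for this example; only createLinearGradient(), addColorStop(), fillStyle, and fillRect() come from the canvas API.

```javascript
// Sketch: fill a width x height area with a top-to-bottom gradient.
// fillVerticalGradient is a hypothetical helper, not part of the canvas API.
function fillVerticalGradient(context, width, height, topColor, bottomColor) {
  // x stays constant (0); y ranges from 0 to the height of the area
  var gradient = context.createLinearGradient(0, 0, 0, height);
  gradient.addColorStop(0, topColor);     // position 0 is the start of the line
  gradient.addColorStop(1, bottomColor);  // position 1 is the end of the line

  // Nothing is drawn until a shape is filled with the gradient
  context.fillStyle = gradient;
  context.fillRect(0, 0, width, height);
  return gradient;
}
```

For the 300 × 225 canvas in this section, fillVerticalGradient(context, 300, 225, "black", "white") should shade from black at the top to white at the bottom.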

123.
  - drawImage(image, dx, dy) takes an image and draws it on the canvas. The given coordinates (dx, dy) will be the upper-left corner of the image. Coordinates (0, 0) would draw the image at the upper-left corner of the canvas.
  - drawImage(image, dx, dy, dw, dh) takes an image, scales it to a width of dw and a height of dh, and draws it on the canvas at coordinates (dx, dy).
  - drawImage(image, sx, sy, sw, sh, dx, dy, dw, dh) takes an image, clips it to the rectangle (sx, sy, sw, sh), scales it to dimensions (dw, dh), and draws it on the canvas at coordinates (dx, dy).

The HTML5 specification explains the drawImage() parameters:

    The source rectangle is the rectangle [within the source image] whose corners are the four points (sx, sy), (sx+sw, sy), (sx+sw, sy+sh), (sx, sy+sh).

    The destination rectangle is the rectangle [within the canvas] whose corners are the four points (dx, dy), (dx+dw, dy), (dx+dw, dy+dh), (dx, dy+dh).

To draw an image on a canvas, you need an image. The image can be an existing <img> element, or you can create an Image() object with JavaScript. Either way, you need to ensure that the image is fully loaded before you can draw it on the canvas.

If you’re using an existing <img> element, you can safely draw it on the canvas during the window.onload event.
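The corner points quoted from the specification are plain arithmetic, so they are easy to compute for checking your own drawImage() arguments. The helper name rectangleCorners is an assumption for this sketch; it works equally for the source rectangle (sx, sy, sw, sh) and the destination rectangle (dx, dy, dw, dh).

```javascript
// Sketch: the four corners of a rectangle, in the order the spec lists them.
// rectangleCorners is a hypothetical helper, not part of the canvas API.
function rectangleCorners(x, y, w, h) {
  return [
    [x, y],          // upper-left
    [x + w, y],      // upper-right
    [x + w, y + h],  // lower-right
    [x, y + h]       // lower-left
  ];
}

// Destination rectangle for drawImage(image, 50, 37, 88, 56):
rectangleCorners(50, 37, 88, 56);
// [[50, 37], [138, 37], [138, 93], [50, 93]]
```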

125.
Here is the script that produces the “multicat” effect:

    cat.onload = function() {
      for (var x = 0, y = 0; x < 500 && y < 375; x += 50, y += 37) {
        context.drawImage(cat, x, y, 88, 56);  // Scale the image
      }
    };

All this effort raises a legitimate question: why would you want to draw an image on a canvas in the first place? What does the extra complexity of image-on-a-canvas buy you over an <img> element and some CSS rules? Even the “multicat” effect could be replicated with 10 overlapping <img> elements.

The simple answer is, for the same reason you might want to draw text on a canvas. The canvas coordinates diagram included text, lines, and shapes; the text-on-a-canvas was just one part of a larger work. A more complex diagram could easily use drawImage() to include icons, sprites, or other graphics.

❧
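The figure of 10 images follows from the loop bounds, which can be checked without a canvas at all. The function below is a hypothetical helper that runs the same loop header but collects positions instead of drawing.

```javascript
// Sketch: collect the (x, y) positions the "multicat" loop would draw at.
// multicatPositions is a made-up name; the loop header matches the script above.
function multicatPositions() {
  var positions = [];
  for (var x = 0, y = 0; x < 500 && y < 375; x += 50, y += 37) {
    positions.push({ x: x, y: y });
  }
  return positions;
}

multicatPositions().length;  // 10
```

The x bound is what stops the loop: after the tenth pass x reaches 500, while y has only reached 370.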

126.
WHAT ABOUT IE?

Microsoft Internet Explorer (up to and including version 8, the current version at time of writing) does not support the canvas API. However, Internet Explorer does support a Microsoft-proprietary technology called VML, which can do many of the same things as the <canvas> element. And thus, excanvas.js was born.

Explorercanvas (excanvas.js) is an open source, Apache-licensed JavaScript library that implements the canvas API in Internet Explorer. To use it, include the following <script> element at the top of your page.

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="utf-8">
      <title>Dive Into HTML5</title>
      <!--[if IE]>
        <script src="excanvas.js"></script>
      <![endif]-->
    </head>
    <body>
      ...
    </body>
    </html>

The <!--[if IE]> and <![endif]--> bits are conditional comments. Internet Explorer interprets them like an if statement: “if the current browser is any version of Internet Explorer, then execute this block.” Every other browser will treat the entire block as an HTML comment. The net result is that Internet Explorer will download the excanvas.js script and execute it, but other browsers will ignore the script altogether (not download it, not execute it, not anything). This makes your page load faster in browsers that implement the canvas API natively.

Once you include the excanvas.js in the <head> of your page, you don’t need to do anything else to accommodate Internet Explorer. Just include <canvas> elements in your

127.
markup, or create them dynamically with JavaScript. Follow the instructions in this chapter to get the drawing context of a <canvas> element, and you can draw shapes, text, and patterns.

Well… not quite. There are a few limitations:

 1. Gradients can only be linear. Radial gradients are not supported.
 2. Patterns must be repeating in both directions.
 3. Clipping regions are not supported.
 4. Non-uniform scaling does not correctly scale strokes.
 5. It’s slow. This should not come as a raging shock to anyone, since Internet Explorer’s JavaScript parser is slower than other browsers to begin with. Once you start drawing complex shapes via a JavaScript library that translates commands to a completely different technology, things are going to get bogged down. You won’t notice the performance degradation in simple examples like drawing a few lines and transforming an image, but you’ll see it right away once you start doing canvas-based animation and other crazy stuff.

There is one more caveat about using excanvas.js, and it’s a problem that I ran into while creating the examples in this chapter. ExplorerCanvas initializes its own faux-canvas interface automatically whenever you include the excanvas.js script in your HTML page. But that doesn’t mean that Internet Explorer is ready to use it immediately. In certain situations, you can run into a race condition where the faux-canvas interface is almost, but not quite, ready to use. The primary symptom of this state is that Internet Explorer will complain that “object doesn’t support this property or method” whenever you try to do anything with a <canvas> element, such as get its drawing context.

The easiest solution to this is to defer all of your canvas-related manipulation until after the onload event fires. This may be a while (if your page has a lot of images or videos, they will delay the onload event), but it will give ExplorerCanvas time to work its magic.

❧

A COMPLETE, LIVE EXAMPLE

128.
Halma is a centuries-old board game. Many variations exist. In this example, I’ve created a solitaire version of Halma with 9 pieces on a 9 × 9 board. In the beginning of the game, the pieces form a 3 × 3 square in the bottom-left corner of the board. The object of the game is to move all the pieces so they form a 3 × 3 square in the upper-right corner of the board, in the fewest moves.

There are two types of legal moves in Halma:

  - Take a piece and move it to any adjacent empty square. An “empty” square is one that does not currently have a piece in it. An “adjacent” square is immediately north, south, east, west, northwest, northeast, southwest, or southeast of the piece’s current position. (The board does not wrap around from one side to the other. If a piece is in the left-most column, it cannot move west, northwest, or southwest. If a piece is in the bottom-most row, it cannot move south, southeast, or southwest.)
  - Take a piece and hop over an adjacent piece, and possibly repeat. That is, if you hop over an adjacent piece, then hop over another piece adjacent to your new position, that counts as a single move. In fact, any number of hops still counts as a single move. (Since the goal is to minimize the total number of moves, doing well in Halma involves constructing, and then using, long chains of staggered pieces so that other pieces can hop over them in long sequences.)

Here is the game itself. You can also play it on a separate page if you want to poke at it with your browser’s developer tools.
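The two kinds of legal move can be expressed as small predicates. This is a sketch, not the game’s actual code: the function names are invented, cells and pieces are assumed to be plain objects with row and column properties, and a hop is assumed to follow the usual Halma rule of landing on the empty square directly beyond the piece being hopped.

```javascript
// Sketch of Halma move checks. All names here are hypothetical.
// Cells and pieces are {row, column} objects on a size x size board.
function isOnBoard(cell, size) {
  return cell.row >= 0 && cell.row < size &&
         cell.column >= 0 && cell.column < size;
}

function isOccupied(cell, pieces) {
  return pieces.some(function (p) {
    return p.row === cell.row && p.column === cell.column;
  });
}

// A step moves to one of the eight neighbors, which must be empty.
function isLegalStep(from, to, pieces, size) {
  var dRow = Math.abs(to.row - from.row);
  var dColumn = Math.abs(to.column - from.column);
  var adjacent = Math.max(dRow, dColumn) === 1;
  return adjacent && isOnBoard(to, size) && !isOccupied(to, pieces);
}

// A single hop lands two squares away in a straight or diagonal line,
// on an empty square, with a piece on the square in between.
function isLegalHop(from, to, pieces, size) {
  var dRow = to.row - from.row;
  var dColumn = to.column - from.column;
  var twoAway = Math.max(Math.abs(dRow), Math.abs(dColumn)) === 2 &&
                Math.abs(dRow) !== 1 && Math.abs(dColumn) !== 1;
  if (!twoAway || !isOnBoard(to, size) || isOccupied(to, pieces)) {
    return false;
  }
  var middle = { row: from.row + dRow / 2, column: from.column + dColumn / 2 };
  return isOccupied(middle, pieces);
}
```

A multi-hop move would then be a chain of positions in which every consecutive pair passes isLegalHop().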

129.
How does it work? I’m so glad you asked. I won’t show all the code here. (You can see it at diveintohtml5.org/examples/halma.js.) I’ll skip over most of the gameplay code itself, but I want to highlight a few parts of the code that deal with actually drawing on the canvas and responding to mouse clicks on the canvas element.

During page load, we initialize the game by setting the dimensions of the <canvas> itself and storing a reference to its drawing context.

    gCanvasElement.width = kPixelWidth;
    gCanvasElement.height = kPixelHeight;
    gDrawingContext = gCanvasElement.getContext("2d");

Then we do something you haven’t seen yet: we add an event listener to the <canvas> element to listen for click events.

    gCanvasElement.addEventListener("click", halmaOnClick, false);

The halmaOnClick() function gets called when the user clicks anywhere within the canvas.

130.
Its argument is a MouseEvent object that contains information about where the user clicked.

    function halmaOnClick(e) {
      var cell = getCursorPosition(e);

      // the rest of this is just gameplay logic
      for (var i = 0; i < gNumPieces; i++) {
        if ((gPieces[i].row == cell.row) &&
            (gPieces[i].column == cell.column)) {
          clickOnPiece(i);
          return;
        }
      }
      clickOnEmptyCell(cell);
    }

The next step is to take the MouseEvent object and calculate which square on the Halma board just got clicked. The Halma board takes up the entire canvas, so every click is somewhere on the board. We just need to figure out where. This is tricky, because mouse events are implemented differently in just about every browser.

    function getCursorPosition(e) {
      var x;
      var y;
      if (e.pageX != undefined && e.pageY != undefined) {
        x = e.pageX;
        y = e.pageY;
      }
      else {
        x = e.clientX + document.body.scrollLeft +
              document.documentElement.scrollLeft;
        y = e.clientY + document.body.scrollTop +
              document.documentElement.scrollTop;
      }

At this point, we have x and y coordinates that are relative to the document (that is, the entire HTML page). That’s not quite useful yet. We want coordinates relative to the canvas.

131.
      x -= gCanvasElement.offsetLeft;
      y -= gCanvasElement.offsetTop;

Now we have x and y coordinates that are relative to the canvas. That is, if x is 0 and y is 0 at this point, we know that the user just clicked the top-left pixel of the canvas.

From here, we can calculate which Halma square the user clicked, and then act accordingly.

      var cell = new Cell(Math.floor(y/kPieceHeight),
                          Math.floor(x/kPieceWidth));
      return cell;
    }

Whew! Mouse events are tough. But you can use the same logic (in fact, this exact code) in all of your own canvas-based applications. Remember: mouse click → document-relative coordinates → canvas-relative coordinates → application-specific code.

OK, let’s look at the main drawing routine. Because the graphics are so simple, I’ve chosen to clear and redraw the board in its entirety every time anything changes within the game. This is not strictly necessary. The canvas drawing context will retain whatever you have previously drawn on it, even if the user scrolls the canvas out of view or changes to another tab and then comes back later. If you’re developing a canvas-based application with more complicated graphics (such as an arcade game), you can optimize performance by tracking which regions of the canvas are “dirty” and redrawing just the dirty regions. But that is outside the scope of this book.

    gDrawingContext.clearRect(0, 0, kPixelWidth, kPixelHeight);

The board-drawing routine should look familiar. It’s similar to how we drew the canvas coordinates diagram earlier in this chapter.

    gDrawingContext.beginPath();

    /* vertical lines */
    for (var x = 0; x <= kPixelWidth; x += kPieceWidth) {
      gDrawingContext.moveTo(0.5 + x, 0);

132.
      gDrawingContext.lineTo(0.5 + x, kPixelHeight);
    }

    /* horizontal lines */
    for (var y = 0; y <= kPixelHeight; y += kPieceHeight) {
      gDrawingContext.moveTo(0, 0.5 + y);
      gDrawingContext.lineTo(kPixelWidth, 0.5 + y);
    }

    /* draw it! */
    gDrawingContext.strokeStyle = "#ccc";
    gDrawingContext.stroke();

The real fun begins when we go to draw each of the individual pieces. A piece is a circle, something we haven’t drawn before. Furthermore, if the user selects a piece in anticipation of moving it, we want to draw that piece as a filled-in circle. Here, the argument p represents a piece, which has row and column properties that denote the piece’s current location on the board. We use some in-game constants to translate (column, row) into canvas-relative (x, y) coordinates, then draw a circle, then (if the piece is selected) fill in the circle with a solid color.

    function drawPiece(p, selected) {
      var column = p.column;
      var row = p.row;
      var x = (column * kPieceWidth) + (kPieceWidth/2);
      var y = (row * kPieceHeight) + (kPieceHeight/2);
      var radius = (kPieceWidth/2) - (kPieceWidth/10);

That’s the end of the game-specific logic. Now we have (x, y) coordinates, relative to the canvas, for the center of the circle we want to draw. There is no circle() method in the canvas API, but there is an arc() method. And really, what is a circle but an arc that goes all the way around? Do you remember your basic geometry? The arc() method takes a center point (x, y), a radius, a start and end angle (in radians), and a direction flag (false for clockwise, true for counter-clockwise). You can use the Math object that’s built into JavaScript to calculate radians.

      gDrawingContext.beginPath();

133.
      gDrawingContext.arc(x, y, radius, 0, Math.PI * 2, false);
      gDrawingContext.closePath();

But wait! Nothing has been drawn yet. Like moveTo() and lineTo(), the arc() method is a “pencil” method. To actually draw the circle, we need to set the strokeStyle and call stroke() to trace it in “ink.”

      gDrawingContext.strokeStyle = "#000";
      gDrawingContext.stroke();

What if the piece is selected? We can re-use the same path we created to draw the outline of the piece, to fill in the circle with a solid color.

      if (selected) {
        gDrawingContext.fillStyle = "#000";
        gDrawingContext.fill();
      }

And that’s… well, that’s pretty much it. The rest of the program is game-specific logic: distinguishing between valid and invalid moves, keeping track of the number of moves, detecting whether the game is over. With 9 circles, a few straight lines, and 1 onclick handler, we’ve created an entire game in <canvas>. Huzzah!

❧

FURTHER READING

  - Canvas tutorial on Mozilla Developer Center
  - HTML5 canvas - the basics, by Mihai Sucan
  - CanvasDemos.com: demos, tools, and tutorials for the HTML canvas element
  - The canvas element in the HTML5 draft standard

❧

134.
This has been “Let’s Call It A Draw(ing Surface).” The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?

In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

Copyright MMIX–MMX Mark Pilgrim

135.
№6. YOU ARE HERE (AND SO IS EVERYBODY ELSE)

❧

DIVING IN

Geolocation is the art of figuring out where you are in the world and (optionally) sharing that information with people you trust. There is more than one way to figure out where you are: your IP address, your wireless network connection, which cell tower your phone is talking to, or dedicated GPS hardware that calculates latitude and longitude from information sent by satellites in the sky.

ASK PROFESSOR MARKUP

☞ Q: Geolocation sounds scary. Can I turn it off?

A: Privacy is an obvious concern when you’re talking about sharing your physical location with a remote web server. The geolocation API explicitly states: “User Agents must not send

136.
location information to Web sites without the express permission of the user.” In other words, sharing your location is always opt-in. If you don’t want to, you don’t have to.

❧

THE GEOLOCATION API

The geolocation API lets you share your location with trusted web sites. The latitude and longitude are available to JavaScript on the page, which in turn can send it back to the remote web server and do fancy location-aware things like finding local businesses or showing your location on a map.

As you can see from the following table, the geolocation API is supported by most browsers on the desktop and mobile devices. Additionally, some older browsers and devices can be supported by wrapper libraries, as we’ll see later in this chapter.

    GEOLOCATION API SUPPORT
    IE   FIREFOX  SAFARI  CHROME  OPERA  IPHONE  ANDROID
    ·    3.5+     5.0+    5.0+    10.6+  3.0+    2.0+

Along with support for the standard geolocation API, there are a plethora of device-specific APIs on other mobile platforms. I’ll cover all that later in this chapter.

❧

SHOW ME THE CODE

The geolocation API centers around a new property on the global navigator object:

137.
navigator.geolocation.

The simplest use of the geolocation API looks like this:

    function get_location() {
      navigator.geolocation.getCurrentPosition(show_map);
    }

That has no detection, no error handling, and no options. Your web application should probably include at least the first two of those. To detect support for the geolocation API, you can use Modernizr:

    function get_location() {
      if (Modernizr.geolocation) {  // I CAN HAS GEO?
        navigator.geolocation.getCurrentPosition(show_map);
      } else {
        // no native support; maybe try Gears?
      }
    }

What you do without geolocation support is up to you. I’ll explain the Gears fallback option in a minute, but first I want to talk about what happens during that call to getCurrentPosition(). As I mentioned at the beginning of this chapter, geolocation support is opt-in. That means your browser will never force you to reveal your current physical location to a remote server. The user experience differs from browser to browser. In Mozilla Firefox, calling the getCurrentPosition() function of the geolocation API will cause the browser to pop up an “infobar” at the top of the browser window. The infobar looks like this:

There’s a lot going on here. You, as the end user,

  - are told that a website wants to know your location
  - are told which website wants to know your location
  - can click through to Mozilla’s “Location-Aware Browsing” help page which explains what the heck is going on

138.
can oose to share your location can oose not to share your location can tell your browser to remember your oice (either way, share or don’t share) so you never see this infobar again on this websiteFurthermore, this infobar is non-modal, so it won’t prevent you from switing to another browser window or tab tab-speciﬁc, so it will disappear if you swit to another browser window or tab and reappear when you swit ba to the original tab unconditional, so there is no way for a website to bypass it bloing, so there is no ance that the website can determine your location while it’s waiting for your answerYou just saw the JavaScript code that causes this infobar to appear. It’s a single function callwhi takes a callba function (whi I called show_map). e call togetCurrentPosition() will return immediately, but that doesn’t mean that you haveaccess to the user’s location. e ﬁrst time you are guaranteed to have location information isin the callba function. e callba function looks like this: function show_map(position) { var latitude = position.coords.latitude; var longitude = position.coords.longitude; // lets show a map or do something interesting! }e callba function will be called with a single parameter, an object with two properties:coords and timestamp . e timestamp is just that, the date and time when the location wascalculated. (Since this is all happening asynronously, you can’t really know when that willhappen in advance. It might take some time for the user to read the infobar and agree toshare their location. Devices with dedicated GPS hardware may take some more time toconnect to a GPS satellite. And so on.) e coords object has properties like latitude andlongitude whi are exactly what they sound like: the user’s physical location in the world. POSITION OBJECT Property Type Notes coords.latitude double decimal degrees coords.longitude double decimal degreesdiveintohtml5.org YOU ARE HERE (AND SO IS EVERYBODY ELSE)

139.
    coords.altitude           double or null  meters above the reference ellipsoid
    coords.accuracy           double          meters
    coords.altitudeAccuracy   double or null  meters
    coords.heading            double or null  degrees clockwise from true north
    coords.speed              double or null  meters/second
    timestamp                 DOMTimeStamp    like a Date() object

Only three of the properties are guaranteed to be there (coords.latitude, coords.longitude, and coords.accuracy). The rest might come back null, depending on the capabilities of your device and the backend positioning server that it talks to. The heading and speed properties are calculated based on the user’s previous position, if possible.

❧

HANDLING ERRORS

Geolocation is complicated. Things can go wrong. I’ve mentioned the “user consent” angle already. If your web application wants the user’s location but the user doesn’t want to give it to you, you’re screwed. The user always wins. But what does that look like in code? It looks like the second argument to the getCurrentPosition() function: an error handling callback function.

    navigator.geolocation.getCurrentPosition(
      show_map, handle_error);

If anything goes wrong, your error callback function will be called with a PositionError object.

    POSITIONERROR OBJECT
    Property   Type       Notes
    code       short      an enumerated value

140.
    message    DOMString  not intended for end users

The code property will be one of

  - PERMISSION_DENIED (1) if the user clicks that “Don’t Share” button or otherwise denies you access to their location.
  - POSITION_UNAVAILABLE (2) if the network is down or the positioning satellites can’t be contacted.
  - TIMEOUT (3) if the network is up but it takes too long to calculate the user’s position. How long is “too long”? I’ll show you how to define that in the next section.
  - UNKNOWN_ERROR (0) if anything else goes wrong.

    // Be gracious in defeat
    function handle_error(err) {
      if (err.code == 1) {
        // user said no!
      }
    }

ASK PROFESSOR MARKUP

☞ Q: Does the geolocation API work on the International Space Station, on the moon, or on other planets?

A: The geolocation specification states, “The geographic coordinate reference system used by the attributes in this interface is the World Geodetic System (2d) [WGS84]. No other reference system is supported.” The International Space Station is orbiting Earth, so astronauts on the station can describe their location by latitude, longitude, and altitude. However, the World Geodetic System is Earth-centric, so it can’t be used to describe locations on the moon or on other planets.
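A fuller error callback would branch on every code listed above, not just the first. The sketch below is one way to do that; describePositionError and the wording of its messages are invented for this example, and only the numeric codes come from the list of PositionError values.

```javascript
// Sketch: turn a PositionError code into a message your own UI could show.
// describePositionError and its message strings are hypothetical.
function describePositionError(err) {
  switch (err.code) {
    case 1:  // PERMISSION_DENIED
      return "The user declined to share a location.";
    case 2:  // POSITION_UNAVAILABLE
      return "The position could not be determined.";
    case 3:  // TIMEOUT
      return "Finding the position took too long.";
    default: // UNKNOWN_ERROR (0) or anything unexpected
      return "Something else went wrong.";
  }
}
```

You could call it from your own error callback, for example function handle_error(err) { alert(describePositionError(err)); }, rather than showing err.message, which is not intended for end users.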

141.
❧

CHOICES! I DEMAND CHOICES!

Some popular mobile devices, like the iPhone and Android phones, support two methods of figuring out where you are. The first method triangulates your position based on your relative proximity to different cellular towers operated by your phone carrier. This method is fast and doesn’t require any dedicated GPS hardware, but it only gives you a rough idea of where you are. Depending on how many cell towers are in your area, “a rough idea” could be as little as one city block or as much as a kilometer in every direction.

The second method actually uses dedicated GPS hardware on your device to talk to dedicated GPS positioning satellites that are orbiting the Earth. GPS can usually pinpoint your location within a few meters. The downside is that the dedicated GPS chip on your device draws a lot of power, so phones and other general purpose mobile devices usually turn off the chip until it’s needed. That means there will be a startup delay while the chip is initializing its connection with the GPS satellites in the sky. If you’ve ever used Google Maps on an iPhone or other smartphone, you’ve seen both methods in action. First you see a large circle that approximates your position (finding the nearest cell tower), then a smaller circle (triangulating with other cell towers), then a single dot with an exact position (given by GPS satellites).

The reason I mention this is that, depending on your web application, you may not need high accuracy. If you’re just looking for nearby movie listings, a “low accuracy” location is probably good enough. There aren’t that many movie theaters, even in dense cities, and you’ll probably be listing more than one of them anyway. On the other hand, if you’re giving turn-by-turn directions in real time, you really do need to know exactly where the user is so you

142.
can say “turn right in 20 meters” or whatever.

The getCurrentPosition() function has an optional third argument, a PositionOptions object. There are three properties you can set in a PositionOptions object. All the properties are optional. You can set any or all or none of them.

POSITIONOPTIONS OBJECT

Property            Type     Default       Notes
enableHighAccuracy  Boolean  false         true might be slower
timeout             long     (no default)  in milliseconds
maximumAge          long     0             in milliseconds

The enableHighAccuracy property is exactly what it sounds like. If true, and the device can support it, and the user consents to sharing their exact location, then the device will try to provide it. Both iPhones and Android phones have separate permissions for low- and high-accuracy positioning, so it is possible that calling getCurrentPosition() with enableHighAccuracy:true will fail, but calling with enableHighAccuracy:false would succeed.

The timeout property is the number of milliseconds your web application is willing to wait for a position. This timer doesn’t start counting down until after the user gives permission to even try to calculate their position. You’re not timing the user; you’re timing the network.

The maximumAge property allows the device to answer immediately with a cached position. For example, let’s say you call getCurrentPosition() for the first time, the user consents, and your success callback function is called with a position that was calculated at exactly 10:00 AM. Exactly one minute later, at 10:01 AM, you call getCurrentPosition() again with a maximumAge property of 75000.

    navigator.geolocation.getCurrentPosition(
      success_callback, error_callback, {maximumAge: 75000});

What you’re saying is that you don’t necessarily need the user’s current location. You would be satisfied with knowing where they were 75 seconds ago (75000 milliseconds). The device knows where the user was 60 seconds ago (60000 milliseconds), because it calculated their location after the first time you called getCurrentPosition(). So the device doesn’t bother to recalculate the user’s current location. It just returns exactly the same information it

143.
returned the first time: same latitude and longitude, same accuracy, and same timestamp (10:00 AM).

Before you ask for the user’s location, you should think about just how much accuracy you need, and set enableHighAccuracy accordingly. If you need to find their location more than once, you should think about how old the information could be and still be useful, and set maximumAge accordingly. If you need to find their location continuously, then getCurrentPosition() is not for you. You need to upgrade to watchPosition().

The watchPosition() function has the same structure as getCurrentPosition(). It takes two callback functions, a required one for success and an optional one for error conditions, and it can also take an optional PositionOptions object that has all the same properties you just learned about. The difference is that your callback function will be called every time the user’s location changes. There is no need to actively poll their position. The device will determine the optimal polling interval, and it will call your callback function whenever it determines that the user’s position has changed. You can use this to update a visible marker on a map, provide instructions on where to go next, or whatever you like. It’s entirely up to you.

The watchPosition() function itself returns a number. You should probably store this number somewhere. If you ever want to stop watching the user’s location change, you can call the clearWatch() method and pass it this number, and the device will stop calling your callback function. If you’ve ever used the setInterval() and clearInterval() functions in JavaScript, this works the same way.

❧ WHAT ABOUT IE?

144.
Internet Explorer does not support the W3C geolocation API that I’ve just described. But don’t despair! Gears is an open source browser plugin from Google that works on Windows, Mac, Linux, Windows Mobile, and Android. It provides features for older browsers. One of the features that Gears provides is a geolocation API. It’s not quite the same as the W3C geolocation API, but it serves the same purpose.

While we’re on the subject of legacy platforms, I should point out that many older mobile phone platforms had their own device-specific geolocation APIs. BlackBerry, Nokia, Palm, and OMTP BONDI all provide their own geolocation APIs. Of course, they all work differently from Gears, which in turn works differently from the W3C geolocation API. Wheeeeee!

❧ GEO.JS TO THE RESCUE

geo.js is an open source, MIT-licensed JavaScript library that smooths over the differences between the W3C geolocation API, the Gears API, and the APIs provided by mobile platforms. To use it, you’ll need to add two <script> elements at the bottom of your page. (Technically, you could put them anywhere, but scripts in your <head> will make your page load more slowly. So don’t do that!)

The first script is gears_init.js, which initializes Gears if it’s installed. The second script is geo.js.

⇜ Don’t let it go to your <head>

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="utf-8">
      <title>Dive Into HTML5</title>
    </head>
    <body>
      ...
      <script src="gears_init.js"></script>
      <script src="geo.js"></script>
    </body>
    </html>

Now you’re ready to use whichever geolocation API is installed.

    if (geo_position_js.init()) {
      geo_position_js.getCurrentPosition(geo_success, geo_error);
    }

Let’s take that one step at a time. First, you need to explicitly call an init() function. The init() function returns true if a supported geolocation API is available.

    if (geo_position_js.init()) {

Calling the init() function does not actually find your location. It just verifies that finding your location is possible. To actually find your location, you need to call the getCurrentPosition() function.

    geo_position_js.getCurrentPosition(geo_success, geo_error);

The getCurrentPosition() function will trigger your browser to ask for your permission to find and share your location. If geolocation is being provided by Gears, this will pop up a dialog asking if you trust the web site to use Gears. If your browser natively supports the geolocation API, the dialog will look different. For example, Firefox 3.5 natively supports the geolocation API. If you try to find your location in Firefox 3.5, it will display an infobar at the top of the page asking whether you want to share your location with this web site.

The getCurrentPosition() function takes two callback functions as arguments. If the getCurrentPosition() function was successful in finding your location (that is, you gave your permission and the geolocation API actually worked its magic), it will call the function passed in as the first argument. In this example, the success callback function is called geo_success.

    geo_position_js.getCurrentPosition(geo_success, geo_error);

The success callback function takes a single argument, which contains the position information.

146.
↶ Success callback

    function geo_success(p) {
      alert("Found you at latitude " + p.coords.latitude +
            ", longitude " + p.coords.longitude);
    }

If the getCurrentPosition() function could not find your location, either because you declined to give your permission or because the geolocation API failed for some reason, it will call the function passed in as the second argument. In this example, the failure callback function is called geo_error.

    geo_position_js.getCurrentPosition(geo_success, geo_error);

The failure callback function takes no arguments.

↶ Failure callback

    function geo_error() {
      alert("Could not find you!");
    }

geo.js does not currently support the watchPosition() function. If you need continuous location information, you’ll need to actively poll getCurrentPosition() yourself.

❧ A COMPLETE, LIVE EXAMPLE

Here is a live example of using geo.js to attempt to get your location and display a map of your immediate surroundings:

147.
Your browser does not support geolocation. :(

How does it work? Let’s take a look. On page load, this page calls geo_position_js.init() to determine whether geolocation is available through any of the interfaces that geo.js supports. If so, it sets up a link you can click to look up your location. Clicking that link calls the lookup_location() function, shown here:

    function lookup_location() {
      geo_position_js.getCurrentPosition(show_map, show_map_error);
    }

If you give your consent to track your location, and the backend service was actually able to determine your location, geo.js calls the first callback function, show_map(), with a single argument, loc. The loc object has a coords property which contains latitude, longitude, and accuracy information. (This example doesn’t use the accuracy information.) The rest of the show_map() function uses the Google Maps API to set up an embedded map.

    function show_map(loc) {
      $("#geo-wrapper").css({"width": "320px", "height": "350px"});
      var map = new GMap2(document.getElementById("geo-wrapper"));
      var center = new GLatLng(loc.coords.latitude, loc.coords.longitude);
      map.setCenter(center, 14);
      map.addControl(new GSmallMapControl());
      map.addControl(new GMapTypeControl());
      // ...
    }

149.
paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

Copyright MMIX–MMX Mark Pilgrim

150.
№7. THE PAST, PRESENT & FUTURE OF LOCAL STORAGE FOR WEB APPLICATIONS

❧ DIVING IN

Persistent local storage is one of the areas where native client applications have held an advantage over web applications. For native applications, the operating system typically provides an abstraction layer for storing and retrieving application-specific data like preferences or runtime state. These values may be stored in the registry, INI files, XML files, or some other place according to platform convention. If your native client application needs local storage beyond key/value pairs, you can embed your own database, invent your own file format, or any number of other solutions.

Historically, web applications have had none of these luxuries. Cookies were invented early in the web’s history, and indeed they can be used for persistent local storage of small amounts of data. But they have three potentially dealbreaking downsides:

151.
• Cookies are included with every HTTP request, thereby slowing down your web application by needlessly transmitting the same data over and over
• Cookies are included with every HTTP request, thereby sending data unencrypted over the internet (unless your entire web application is served over SSL)
• Cookies are limited to about 4 KB of data, enough to slow down your application (see above), but not enough to be terribly useful

What we really want is a lot of storage space, on the client, that persists beyond a page refresh, and isn’t transmitted to the server.

Before HTML5, all attempts to achieve this were ultimately unsatisfactory in different ways.

❧ A BRIEF HISTORY OF LOCAL STORAGE HACKS BEFORE HTML5

In the beginning, there was only Internet Explorer. Or at least, that’s what Microsoft wanted the world to think. To that end, as part of the First Great Browser Wars, Microsoft invented a great many things and included them in their browser-to-end-all-browser-wars, Internet Explorer. One of these things was called DHTML Behaviors, and one of these behaviors was called userData.

userData allows web pages to store up to 64 KB of data per domain, in a hierarchical XML-based structure. (Trusted domains, such as intranet sites, can store 10 times that amount. And hey, 640 KB ought to be enough for anybody.) IE does not present any form of permissions dialog, and there is no allowance for increasing the amount of storage available.

In 2002, Adobe introduced a feature in Flash 6 that gained the unfortunate and misleading

152.
name of “Flash cookies.” Within the Flash environment, the feature is properly known as Local Shared Objects. Briefly, it allows Flash objects to store up to 100 KB of data per domain. Brad Neuberg developed an early prototype of a Flash-to-JavaScript bridge called AMASS (AJAX Massive Storage System), but it was limited by some of Flash’s design quirks. By 2006, with the advent of ExternalInterface in Flash 8, accessing LSOs from JavaScript became an order of magnitude easier and faster. Brad rewrote AMASS and integrated it into the popular Dojo Toolkit under the moniker dojox.storage. Flash gives each domain 100 KB of storage “for free.” Beyond that, it prompts the user for each order of magnitude increase in data storage (1 MB, 10 MB, and so on).

In 2007, Google launched Gears, an open source browser plugin aimed at providing additional capabilities in browsers. (We’ve previously discussed Gears in the context of providing a geolocation API in Internet Explorer.) Gears provides an API to an embedded SQL database based on SQLite. After obtaining permission from the user once, Gears can store unlimited amounts of data per domain in SQL database tables.

In the meantime, Brad Neuberg and others continued to hack away on dojox.storage to provide a unified interface to all these different plugins and APIs. By 2009, dojox.storage could auto-detect (and provide a unified interface on top of) Adobe Flash, Gears, Adobe AIR, and an early prototype of HTML5 storage that was only implemented in older versions of Firefox.

As you survey these solutions, a pattern emerges: all of them are either specific to a single browser, or reliant on a third-party plugin. Despite heroic efforts to paper over the differences (in dojox.storage), they all expose radically different interfaces, have different storage limitations, and present different user experiences. So this is the problem that HTML5 set out to solve: to provide a standardized API, implemented natively and consistently in multiple browsers, without having to rely on third-party plugins.

❧ INTRODUCING HTML5 STORAGE

153.
What I will refer to as “HTML5 Storage” is a specification named Web Storage, which was at one time part of the HTML5 specification proper, but was split out into its own specification for uninteresting political reasons. Certain browser vendors also refer to it as “Local Storage” or “DOM Storage.” The naming situation is made even more complicated by some related, similarly-named, emerging standards that I’ll discuss later in this chapter.

So what is HTML5 Storage? Simply put, it’s a way for web pages to store named key/value pairs locally, within the client web browser. Like cookies, this data persists even after you navigate away from the web site, close your browser tab, exit your browser, or what have you. Unlike cookies, this data is never transmitted to the remote web server (unless you go out of your way to send it manually). Unlike all previous attempts at providing persistent local storage, it is implemented natively in web browsers, so it is available even when third-party browser plugins are not.

Which browsers? Well, the latest version of pretty much every browser supports HTML5 Storage… even Internet Explorer!

HTML5 STORAGE SUPPORT

IE    FIREFOX  SAFARI  CHROME  OPERA  IPHONE  ANDROID
8.0+  3.5+     4.0+    4.0+    10.5+  2.0+    2.0+

From your JavaScript code, you’ll access HTML5 Storage through the localStorage object on the global window object. Before you can use it, you should detect whether the browser supports it.

↶ check for HTML5 Storage

    function supports_html5_storage() {
      try {
        return 'localStorage' in window && window['localStorage'] !== null;
      } catch (e) {
        return false;
      }
    }
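The check above confirms only that the localStorage object exists. A stricter variant, sketched here as an assumption rather than something this chapter prescribes, also attempts a small write and read, since a browser can expose the object and still refuse writes (for example when the storage area is full). The function name, the win parameter, and the probe key are all hypothetical; passing the window in as an argument simply makes the sketch easy to try outside a browser.

```javascript
// Return true only if `win.localStorage` exists AND accepts a write.
// `win` is normally the global window object; the probe key is arbitrary.
function storage_is_usable(win) {
  var probeKey = "__storage_probe__";
  try {
    var storage = win.localStorage;
    if (!storage) {
      return false; // property missing or null
    }
    storage.setItem(probeKey, "1");
    var ok = (storage.getItem(probeKey) === "1");
    storage.removeItem(probeKey); // leave no trace behind
    return ok;
  } catch (e) {
    return false; // access or write was refused
  }
}
```

In a page you would call storage_is_usable(window). For most pages, the simpler supports_html5_storage() check above is enough.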

154.
Instead of writing this function yourself, you can use Modernizr to detect support for HTML5 Storage.

    if (Modernizr.localstorage) {
      // window.localStorage is available!
    } else {
      // no native support for HTML5 storage :(
      // maybe try dojox.storage or a third-party solution
    }

❧ USING HTML5 STORAGE

HTML5 Storage is based on named key/value pairs. You store data based on a named key, then you can retrieve that data with the same key. The named key is a string. The data can be any type supported by JavaScript, including strings, Booleans, integers, or floats. However, the data is actually stored as a string. If you are storing and retrieving anything other than strings, you will need to use functions like parseInt() or parseFloat() to coerce your retrieved data into the expected JavaScript datatype.

    interface Storage {
      getter any getItem(in DOMString key);
      setter creator void setItem(in DOMString key, in any data);
    };

Calling setItem() with a named key that already exists will silently overwrite the previous value. Calling getItem() with a non-existent key will return null rather than throw an exception.

Like other JavaScript objects, you can treat the localStorage object as an associative array. Instead of using the getItem() and setItem() methods, you can simply use square brackets. For example, this snippet of code:

155.
    var foo = localStorage.getItem("bar");
    // ...
    localStorage.setItem("bar", foo);

…could be rewritten to use square bracket syntax instead:

    var foo = localStorage["bar"];
    // ...
    localStorage["bar"] = foo;

There are also methods for removing the value for a given named key, and clearing the entire storage area (that is, deleting all the keys and values at once).

    interface Storage {
      deleter void removeItem(in DOMString key);
      void clear();
    };

Calling removeItem() with a non-existent key will do nothing.

Finally, there is a property to get the total number of values in the storage area, and to iterate through all of the keys by index (to get the name of each key).

    interface Storage {
      readonly attribute unsigned long length;
      getter DOMString key(in unsigned long index);
    };

If you call key() with an index that is not between 0 and (length-1), the function will return null.

TRACKING CHANGES TO THE HTML5 STORAGE AREA

If you want to keep track programmatically of when the storage area changes, you can trap the storage event. The storage event is fired on the window object whenever

156.
setItem(), removeItem(), or clear() is called and actually changes something. For example, if you set an item to its existing value or call clear() when there are no named keys, the storage event will not fire, because nothing actually changed in the storage area.

The storage event is supported everywhere the localStorage object is supported, which includes Internet Explorer 8. IE 8 does not support the W3C standard addEventListener (although that will finally be added in IE 9). Therefore, to hook the storage event, you’ll need to check which event mechanism the browser supports. (If you’ve done this before with other events, you can skip to the end of this section. Trapping the storage event works the same as every other event you’ve ever trapped. If you prefer to use jQuery or some other JavaScript library to register your event handlers, you can do that with the storage event, too.)

    if (window.addEventListener) {
      window.addEventListener("storage", handle_storage, false);
    } else {
      window.attachEvent("onstorage", handle_storage);
    }

The handle_storage callback function will be called with a StorageEvent object, except in Internet Explorer where the event object is stored in window.event.

    function handle_storage(e) {
      if (!e) { e = window.event; }
    }

At this point, the variable e will be a StorageEvent object, which has the following useful properties.

STORAGEEVENT OBJECT

Property  Type    Description
key       string  the named key that was added, removed, or modified
oldValue  any     the previous value (now overwritten), or null if a new item was added
newValue  any     the new value, or null if an item was removed
url*      string  the page which called a method that triggered this change

* Note: the url property was originally called uri. Some browsers shipped with that property before the specification changed.

157.
For maximum compatibility, you should check whether the url property exists, and if not, check for the uri property instead.

The storage event is not cancelable. From within the handle_storage callback function, there is no way to stop the change from occurring. It’s simply a way for the browser to tell you, “hey, this just happened. There’s nothing you can do about it now; I just wanted to let you know.”

LIMITATIONS IN CURRENT BROWSERS

In talking about the history of local storage hacks using third-party plugins, I made a point of mentioning the limitations of each technique, such as storage limits. I just realized that I haven’t mentioned anything about the limitations of the now-standardized HTML5 Storage. I’ll give you the answers first, then explain them. The answers, in order of importance, are “5 megabytes,” “QUOTA_EXCEEDED_ERR,” and “no.”

“5 megabytes” is how much storage space each origin gets by default. This is surprisingly consistent across browsers, although it is phrased as no more than a suggestion in the HTML5 Storage specification. One thing to keep in mind is that you’re storing strings, not data in its original format. If you’re storing a lot of integers or floats, the difference in representation can really add up. Each digit in that float is being stored as a character, not in the usual representation of a floating point number.

“QUOTA_EXCEEDED_ERR” is the exception that will get thrown if you exceed your storage quota of 5 megabytes. “No” is the answer to the next obvious question, “Can I ask the user for more storage space?” At time of writing, no browser supports any mechanism for web developers to request more storage space. Some browsers (like Opera) allow the user to control each site’s storage quota, but it is purely a user-initiated action, not something that you as a web developer can build into your web application.

❧ HTML5 STORAGE IN ACTION
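One practical consequence of the limits just described is that a write can fail. The sketch below, which is an illustration and not code from this chapter’s demo, wraps setItem() in a try/catch so that a full storage area becomes a false return value instead of an uncaught exception. The helper name try_set_item is hypothetical, and it deliberately treats any exception as a failed write, on the assumption that the exact exception name or code may differ between browsers.

```javascript
// Try to store a value; return true on success, false if the write was refused.
// `storage` is any object with a setItem(key, value) method, such as localStorage.
function try_set_item(storage, key, value) {
  try {
    // Values are stored as strings anyway, so convert explicitly.
    storage.setItem(key, String(value));
    return true;
  } catch (e) {
    // Most likely the quota was exceeded. We do not inspect e.name or e.code,
    // assuming those details vary from browser to browser.
    return false;
  }
}
```

In a page, you might call try_set_item(localStorage, "some.key", someValue) and, when it returns false, tell the user that their data could not be saved.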

158.
Let’s see HTML5 Storage in action. Recall the Halma game we constructed in the canvas chapter. There’s a small problem with the game: if you close the browser window mid-game, you’ll lose your progress. But with HTML5 Storage, we can save the progress locally, within the browser itself. Here is a live demonstration. Make a few moves, then close the browser tab, then re-open it. If your browser supports HTML5 Storage, the demonstration page should magically remember your exact position within the game, including the number of moves you’ve made, the position of each of the pieces on the board, and even whether a particular piece is selected.

How does it work? Every time a change occurs within the game, we call this function:

    function saveGameState() {
      if (!supportsLocalStorage()) { return false; }
      localStorage["halma.game.in.progress"] = gGameInProgress;
      for (var i = 0; i < kNumPieces; i++) {
        localStorage["halma.piece." + i + ".row"] = gPieces[i].row;
        localStorage["halma.piece." + i + ".column"] = gPieces[i].column;
      }
      localStorage["halma.selectedpiece"] = gSelectedPieceIndex;
      localStorage["halma.selectedpiecehasmoved"] = gSelectedPieceHasMoved;
      localStorage["halma.movecount"] = gMoveCount;
      return true;
    }

As you can see, it uses the localStorage object to save whether there is a game in progress (gGameInProgress, a Boolean). If so, it iterates through the pieces (gPieces, a JavaScript Array) and saves the row and column number of each piece. Then it saves some additional game state, including which piece is selected (gSelectedPieceIndex, an integer), whether the piece is in the middle of a potentially long series of hops (gSelectedPieceHasMoved, a Boolean), and the total number of moves made so far (gMoveCount, an integer).

On page load, instead of automatically calling a newGame() function that would reset these variables to hard-coded values, we call a resumeGame() function instead. Using HTML5

159.
Storage, the resumeGame() function checks whether a state about a game-in-progress is stored locally. If so, it restores those values using the localStorage object.

    function resumeGame() {
      if (!supportsLocalStorage()) { return false; }
      gGameInProgress = (localStorage["halma.game.in.progress"] == "true");
      if (!gGameInProgress) { return false; }
      gPieces = new Array(kNumPieces);
      for (var i = 0; i < kNumPieces; i++) {
        var row = parseInt(localStorage["halma.piece." + i + ".row"]);
        var column = parseInt(localStorage["halma.piece." + i + ".column"]);
        gPieces[i] = new Cell(row, column);
      }
      gNumPieces = kNumPieces;
      gSelectedPieceIndex = parseInt(localStorage["halma.selectedpiece"]);
      gSelectedPieceHasMoved = localStorage["halma.selectedpiecehasmoved"] == "true";
      gMoveCount = parseInt(localStorage["halma.movecount"]);
      drawBoard();
      return true;
    }

The most important part of this function is the caveat that I mentioned earlier in this chapter, which I’ll repeat here: Data is stored as strings. If you are storing something other than a string, you’ll need to coerce it yourself when you retrieve it. For example, the flag for whether there is a game in progress (gGameInProgress) is a Boolean. In the saveGameState() function, we just stored it and didn’t worry about the datatype:

    localStorage["halma.game.in.progress"] = gGameInProgress;

But in the resumeGame() function, we need to treat the value we got from the local storage area as a string and manually construct the proper Boolean value ourselves:

160.
    gGameInProgress = (localStorage["halma.game.in.progress"] == "true");

Similarly, the number of moves is stored in gMoveCount as an integer. In the saveGameState() function, we just stored it:

    localStorage["halma.movecount"] = gMoveCount;

But in the resumeGame() function, we need to coerce the value to an integer, using the parseInt() function built into JavaScript:

    gMoveCount = parseInt(localStorage["halma.movecount"]);

❧ BEYOND NAMED KEY-VALUE PAIRS: COMPETING VISIONS

While the past is littered with hacks and workarounds, the present condition of HTML5 Storage is surprisingly rosy. A new API has been standardized and implemented across all major browsers, platforms, and devices. As a web developer, that’s just not something you see every day, is it? But there is more to life than “5 megabytes of named key/value pairs,” and the future of persistent local storage is… how shall I put it… well, there are competing visions.

One vision is an acronym that you probably know already: SQL. In 2007, Google launched Gears, an open source cross-browser plugin which included an embedded database based on SQLite. This early prototype later influenced the creation of the Web SQL Database specification. Web SQL Database (formerly known as “WebDB”) provides a thin wrapper around a SQL database, allowing you to do things like this from JavaScript:

161.
↶ actual working code in 4 browsers

    openDatabase('documents', '1.0', 'Local document storage', 5*1024*1024, function (db) {
      db.changeVersion('', '1.0', function (t) {
        t.executeSql('CREATE TABLE docids (id, name)');
      }, error);
    });

As you can see, most of the action resides in the string you pass to the executeSql method. This string can be any supported SQL statement, including SELECT, UPDATE, INSERT, and DELETE statements. It’s just like backend database programming, except you’re doing it from JavaScript! Oh joy!

The Web SQL Database specification has been implemented by four browsers and platforms.

WEB SQL DATABASE SUPPORT

IE  FIREFOX  SAFARI  CHROME  OPERA  IPHONE  ANDROID
·   ·        4.0+    4.0+    10.5+  3.0+    2.0+

Of course, if you’ve used more than one database product in your life, you are aware that “SQL” is more of a marketing term than a hard-and-fast standard. (Some would say the same of “HTML5,” but never mind that.) Sure, there is an actual SQL specification (it’s called SQL-92), but there is no database server in the world that conforms to that and only that specification. There’s Oracle’s SQL, Microsoft’s SQL, MySQL’s SQL, PostgreSQL’s SQL, and SQLite’s SQL. Indeed, each of these products adds new SQL features over time, so even saying “SQLite’s SQL” is not sufficient to pin down exactly what you’re talking about. You need to say “the version of SQL that shipped with SQLite version X.Y.Z.”

All of which brings us to the following disclaimer, currently residing at the top of the Web SQL Database specification:

This specification has reached an impasse: all interested implementors have used the same SQL backend (Sqlite), but we need multiple independent implementations to proceed along a standardisation path. Until another implementor is interested in implementing this spec, the description of the SQL dialect has been left as simply

162.
a reference to Sqlite, which isn’t acceptable for a standard.

It is against this backdrop that I will introduce you to another competing vision for advanced, persistent, local storage for web applications: the Indexed Database API, formerly known as “WebSimpleDB,” now affectionately known as “IndexedDB.”

The Indexed Database API exposes what’s called an object store. An object store shares many concepts with a SQL database. There are “databases” with “records,” and each record has a set number of “fields.” Each field has a specific datatype, which is defined when the database is created. You can select a subset of records, then enumerate them with a “cursor.” Changes to the object store are handled within “transactions.”

If you’ve done any SQL database programming, these terms probably sound familiar. The primary difference is that the object store has no structured query language. You don’t construct a statement like "SELECT * from USERS where ACTIVE = 'Y'". Instead, you use methods provided by the object store to open a cursor on the database named “USERS,” enumerate through the records, filter out records for inactive users, and use accessor methods to get the values of each field in the remaining records. An early walk-through of IndexedDB is a good tutorial of how IndexedDB works, giving side-by-side comparisons of IndexedDB and Web SQL Database.

At time of writing, IndexedDB has only been implemented in a beta version of Firefox 4. (By contrast, Mozilla has stated that they will never implement Web SQL Database.) Google has stated that they are considering IndexedDB support for Chromium and Google Chrome. And even Microsoft has said that IndexedDB “is a great solution for the web.”

So what can you, as a web developer, do with IndexedDB? At the moment, virtually nothing beyond some technology demos. A year from now? Maybe something. Check the “Further Reading” section for links to some good tutorials to get you started.
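To make the cursor-based style described above a little more concrete, here is a rough sketch of the “active users” query. It assumes the IndexedDB interfaces as browsers eventually shipped them (transaction(), objectStore(), openCursor(), and a cursor with value and continue()), which may differ from the draft available at time of writing. The store name "users", the active field, and the function name are hypothetical, and the db argument is assumed to be an already-opened database.

```javascript
// Collect every record whose `active` field is "Y", without any SQL.
// `db` is assumed to be an open IndexedDB database containing a "users" store.
// `onDone` is called with an array of matching records, or null on error.
function list_active_users(db, onDone) {
  var active = [];
  var store = db.transaction("users", "readonly").objectStore("users");
  var request = store.openCursor();
  request.onsuccess = function (event) {
    var cursor = event.target.result;
    if (!cursor) {
      onDone(active); // no more records: enumeration is finished
      return;
    }
    if (cursor.value.active === "Y") {
      active.push(cursor.value); // keep only active users
    }
    cursor.continue(); // advance; onsuccess fires again for the next record
  };
  request.onerror = function () {
    onDone(null);
  };
}
```

The filtering that a WHERE clause would do in SQL happens here in ordinary JavaScript, one record at a time, inside the cursor callback.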
❧ FURTHER READING

163.
HTML5 storage:

• HTML5 Storage specification
• Introduction to DOM Storage on MSDN
• Web Storage: easier, more powerful client-side data storage on Opera Developer Community
• DOM Storage on Mozilla Developer Center. (Note: most of this page is devoted to Firefox’s prototype implementation of a globalStorage object, a non-standard precursor to localStorage. Mozilla added support for the standard localStorage interface in Firefox 3.5.)
• Unlock local storage for mobile Web applications with HTML5, a tutorial on IBM DeveloperWorks

Early work by Brad Neuberg et al. (pre-HTML5):

• Internet Explorer Has Native Support for Persistence⁈⁈ (about the userData object in IE)
• Dojo Storage, part of a larger tutorial about the (now-defunct) Dojo Offline library
• dojox.storage.manager API reference
• dojox.storage Subversion repository

Web SQL Database:

• Web SQL Database specification
• Introducing Web SQL Databases
• Web Database demonstration
• persistence.js, an “asynchronous JavaScript ORM” built on top of Web SQL Database and Gears

IndexedDB:

• Indexed Database API specification
• Beyond HTML5: Database APIs and the Road to IndexedDB
• Firefox 4: An early walk-through of IndexedDB

❧

This has been “The Past, Present & Future of Local Storage for Web Applications.” The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?

In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

Copyright MMIX–MMX Mark Pilgrim

№8. LET’S TAKE THIS OFFLINE

❧ DIVING IN

What is an offline web application? At first glance, it sounds like a contradiction in terms. Web pages are things you download and render. Downloading implies a network connection. How can you download when you’re offline? Of course, you can’t. But you can download when you’re online. And that’s how HTML5 offline applications work.

At its simplest, an offline web application is a list of URLs: HTML, CSS, JavaScript, images, or any other kind of resource. The home page of the offline web application points to this list, called a manifest file, which is just a text file located elsewhere on the web server. A web browser that implements HTML5 offline applications will read the list of URLs from the manifest file, download the resources, cache them locally, and automatically keep the local copies up to date as they change. When the time comes that you try to access the web application without a network connection, your web browser will automatically switch over to the local copies instead.

From there, most of the work is up to you, the web developer. There’s a flag in the DOM that will tell you whether you’re online or offline. There are events that fire when your offline status changes (one minute you’re offline and the next minute you’re online, or vice-versa). But that’s pretty much it. If your application creates data or saves state, it’s up to you to store that data locally while you’re offline and synchronize it with the remote server once you’re back online. In other words, HTML5 can take your web application offline. What you do once you’re there is up to you.

OFFLINE SUPPORT
IE    FIREFOX    SAFARI    CHROME    OPERA    IPHONE    ANDROID
·     ✓          ✓         ✓         ·        ✓         ✓

❧ THE CACHE MANIFEST

An offline web application revolves around a cache manifest file. What’s a manifest file? It’s a list of all of the resources that your web application might need to access while it’s disconnected from the network. In order to bootstrap the process of downloading and caching these resources, you need to point to the manifest file, using a manifest attribute on your <html> element.

    <!DOCTYPE HTML>
    <html manifest="/cache.manifest">
    <body>
    ...
    </body>
    </html>

Your cache manifest file can be located anywhere on your web server, but it must be served with the content type text/cache-manifest. If you are running an Apache-based web server, you can probably just put an AddType directive in the .htaccess file at the root of your web directory:

    AddType text/cache-manifest .manifest
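The AddType directive simply maps a file extension to a Content-Type. Outside Apache, the same idea might look like the following sketch, written for a hypothetical hand-rolled JavaScript server. The names CONTENT_TYPES, contentTypeFor, and setContentType are made up for illustration, and the only thing assumed about the response object is a setHeader() method like the one on Node’s http.ServerResponse.

```javascript
// Extension-to-type table; the .manifest entry is the one that matters here.
var CONTENT_TYPES = {
  ".manifest": "text/cache-manifest",
  ".html": "text/html",
  ".css": "text/css",
  ".js": "application/javascript",
  ".jpg": "image/jpeg"
};

// Look up the Content-Type for a URL path by its file extension.
function contentTypeFor(urlPath) {
  var dot = urlPath.lastIndexOf(".");
  var ext = dot === -1 ? "" : urlPath.slice(dot).toLowerCase();
  return Object.prototype.hasOwnProperty.call(CONTENT_TYPES, ext)
    ? CONTENT_TYPES[ext]
    : "application/octet-stream";   // fallback for unknown extensions
}

// Set the header on a response object before the file body is sent.
// Reading and sending the file itself is left out of this sketch.
function setContentType(urlPath, res) {
  res.setHeader("Content-Type", contentTypeFor(urlPath));
}
```

Whatever server you use, the point is the same: a request for /cache.manifest has to come back labeled text/cache-manifest.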

Then make sure that the name of your cache manifest file ends with .manifest. If you use a different web server or a different configuration of Apache, consult your server’s documentation on controlling the Content-Type header.

ASK PROFESSOR MARKUP

☞ Q: My web application spans more than one page. Do I need a manifest attribute in each page, or can I just put it in the home page?
A: Every page of your web application needs a manifest attribute that points to the cache manifest for the entire application.

OK, so every one of your HTML pages points to your cache manifest file, and your cache manifest file is being served with the proper Content-Type header. But what goes in the manifest file? This is where things get interesting.

The first line of every cache manifest file is this:

    CACHE MANIFEST

After that, all manifest files are divided into three parts: the “explicit” section, the “fallback” section, and the “online whitelist” section. Each section has a header, on its own line. If the manifest file doesn’t have any section headers, all the listed resources are implicitly in the “explicit” section. Try not to dwell on the terminology, lest your head explode.

Here is a valid manifest file. It lists three resources: a CSS file, a JavaScript file, and a JPEG image.

    CACHE MANIFEST
    /clock.css
    /clock.js
    /clock-face.jpg

This cache manifest file has no section headers, so all the listed resources are in the “explicit” section by default. Resources in the “explicit” section will get downloaded and cached locally, and will be used in place of their online counterparts whenever you are disconnected from the network. Thus, upon loading this cache manifest file, your browser would download clock.css, clock.js, and clock-face.jpg from the root directory of your web server. Then you could unplug your network cable and refresh the page, and all of those resources would be available offline.

ASK PROFESSOR MARKUP

☞ Q: Do I need to list my HTML pages in my cache manifest?
A: Yes and no. If your entire web application is contained in a single page, just make sure that page points to the cache manifest using the manifest attribute. When you navigate to an HTML page with a manifest attribute, the page itself is assumed to be part of the web application, so you don’t need to list it in the manifest file itself. However, if your web application spans multiple pages, you should list all of the HTML pages in the manifest file, otherwise the browser would not know that there are other HTML pages that need to be downloaded and cached.
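To see how the "first line, then sections, explicit by default" rule plays out, here is a simplified sketch of reading a manifest into its three parts. It is not the browser’s actual parsing algorithm: it ignores byte order marks, encoding, and URL resolution. The function name parseCacheManifest and the shape of the returned object are assumptions made for illustration; the section headers CACHE:, NETWORK:, and FALLBACK: and the # comment lines are the ones used by the manifest examples in the rest of this chapter.

```javascript
// Split a cache manifest into its explicit, online whitelist and
// fallback sections. Entries before any header count as explicit.
function parseCacheManifest(text) {
  var lines = text.split(/\r?\n/);
  if (lines[0].trim() !== "CACHE MANIFEST") {
    throw new Error("first line must be CACHE MANIFEST");
  }
  var headers = { "CACHE:": "cache", "NETWORK:": "network", "FALLBACK:": "fallback" };
  var sections = { cache: [], network: [], fallback: [] };
  var current = "cache";               // no header seen yet: "explicit" section
  for (var i = 1; i < lines.length; i += 1) {
    var line = lines[i].trim();
    if (line === "" || line.charAt(0) === "#") {
      continue;                        // skip blank lines and comments
    }
    if (Object.prototype.hasOwnProperty.call(headers, line)) {
      current = headers[line];         // a header switches the active section
      continue;
    }
    if (current === "fallback") {
      var parts = line.split(/\s+/);   // "pattern substitute"
      sections.fallback.push({ pattern: parts[0], substitute: parts[1] });
    } else {
      sections[current].push(line);
    }
  }
  return sections;
}
```

Feeding it the three-line manifest above would put all three resources in the cache (explicit) list and leave the other two lists empty.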

NETWORK SECTIONS

Here is a slightly more complicated example. Suppose you want your clock application to track visitors, using a tracking.cgi script that is loaded dynamically from an <img src> attribute. Caching this resource would defeat the purpose of tracking, so this resource should never be cached and never be available offline. Here is how you do that:

    CACHE MANIFEST
    NETWORK:
    /tracking.cgi
    CACHE:
    /clock.css
    /clock.js
    /clock-face.jpg

This cache manifest file includes section headers. The line marked NETWORK: is the beginning of the “online whitelist” section. Resources in this section are never cached and are not available offline. (Attempting to load them while offline will result in an error.) The line marked CACHE: is the beginning of the “explicit” section. The rest of the cache manifest file is the same as the previous example. Each of the three resources listed will be cached and available offline.

FALLBACK SECTIONS

There is one more type of section in a cache manifest file: a fallback section. In a fallback section, you can define substitutions for online resources that, for whatever reason, can’t be cached or weren’t cached successfully. The HTML5 specification offers this clever example of using a fallback section:

    CACHE MANIFEST
    FALLBACK:
    / /offline.html
    NETWORK:
    *

What does this do? First, consider a site that contains millions of pages, like Wikipedia. You couldn’t possibly download the entire site, nor would you want to. But suppose you could make part of it available offline. But how would you decide which pages to cache? How about this: every page you ever look at on a hypothetical offline-enabled Wikipedia would be downloaded and cached. That would include every encyclopedia entry that you ever visited, every talk page (where you can have makeshift discussions about a particular encyclopedia entry), and every edit page (where you can actually make changes to the particular entry).

That’s what this cache manifest does. Suppose every HTML page (entry, talk page, edit page, history page) on Wikipedia pointed to this cache manifest file. When you visit any page that points to a cache manifest, your browser says “hey, this page is part of an offline web application, is it one I know about?” If your browser hasn’t ever downloaded this particular cache manifest file, it will set up a new offline “appcache” (short for “application cache”), download all the resources listed in the cache manifest, and then add the current page to the appcache. If your browser does know about this cache manifest, it will simply add the current page to the existing appcache. Either way, the page you just visited ends up in the appcache. This is important. It means that you can have an offline web application that “lazily” adds pages as you visit them. You don’t need to list every single one of your HTML pages in your cache manifest.

Now look at the fallback section. The fallback section in this cache manifest only has a single line. The first part of the line (before the space) is not a URL. It’s really a URL pattern. The single character (/) will match any page on your site, not just the home page. When you try to visit a page while you’re offline, your browser will look for it in the appcache.
If your browser finds the page in the appcache (because you visited it while online, and the page was implicitly added to the appcache at that time), then your browser will display the cached copy of the page. If your browser doesn’t find the page in the appcache, instead of displaying an error message, it will display the page /offline.html, as specified in the second half of that line in the fallback section.

Finally, let’s examine the network section. The network section in this cache manifest also has just a single line, a line that contains just a single character (*). This character has special meaning in a network section. It’s called the “online whitelist wildcard flag.” That’s a fancy way of saying that anything that isn’t in the appcache can still be downloaded from the original web address, as long as you have an internet connection. This is important for an “open-ended” offline web application. It means that, while you’re browsing this hypothetical offline-enabled Wikipedia online, your browser will fetch images and videos and other embedded resources normally, even if they are on a different domain. (This is common in large websites, even if they aren’t part of an offline web application. HTML pages are generated and served locally, while images and videos are served from a CDN on another domain.) Without this wildcard flag, our hypothetical offline-enabled Wikipedia would behave strangely when you were online. Specifically, it wouldn’t load any externally-hosted images or videos!

Is this example complete? No. Wikipedia is more than HTML files. It uses common CSS, JavaScript, and images on each page. Each of these resources would need to be listed explicitly in the CACHE: section of the manifest file, in order for pages to display and behave properly offline. But the point of the fallback section is that you can have an “open-ended” offline web application that extends beyond the resources you’ve listed explicitly in the manifest file.

❧ THE FLOW OF EVENTS

So far, I’ve talked about offline web applications, the cache manifest, and the offline application cache (“appcache”) in vague, semi-magical terms. Things are downloaded, browsers make decisions, and everything Just Works. You know better than that, right? I mean, this is web development we’re talking about. Nothing ever Just Works.

First, let’s talk about the flow of events. Specifically, DOM events. When your browser visits a page that points to a cache manifest, it fires off a series of events on the window.applicationCache object. I know this looks complicated, but trust me, this is the simplest version I could come up with that didn’t leave out important information.

1. As soon as it notices a manifest attribute on the <html> element, your browser fires a checking event. (All the events listed here are fired on the window.applicationCache object.)
   The checking event is always fired, regardless of whether you have previously visited this page or any other page that points to the same cache manifest.

2. If your browser has never seen this cache manifest before…
   • It will fire a downloading event, then start to download the resources listed in the cache manifest.
   • While it’s downloading, your browser will periodically fire progress events, which contain information on how many files have been downloaded already and how many files are still queued to be downloaded.
   • After all resources listed in the cache manifest have been downloaded successfully, the browser fires one final event, cached. This is your signal that the offline web application is fully cached and ready to be used offline. That’s it; you’re done.

3. On the other hand, if you have previously visited this page or any other page that points to the same cache manifest, then your browser already knows about this cache manifest. It may already have some resources in the appcache. It may have the entire working offline web application in the appcache. So now the question is, has the cache manifest changed since the last time your browser checked it?
   • If the answer is no, the cache manifest has not changed, your browser will immediately fire a noupdate event. That’s it; you’re done.
   • If the answer is yes, the cache manifest has changed, your browser will fire a downloading event and start re-downloading every single resource listed in the cache manifest.
   • While it’s downloading, your browser will periodically fire progress events, which contain information on how many files have been downloaded already and how many files are still queued to be downloaded.
   • After all resources listed in the cache manifest have been re-downloaded successfully, the browser fires one final event, updateready. This is your signal that the new version of your offline web application is fully cached and ready to be used offline. The new version is not yet in use.
     To “hot-swap” to the new version without forcing the user to reload the page, you can manually call the window.applicationCache.swapCache() function.

If, at any point in this process, something goes horribly wrong, your browser will fire an error event and stop. Here is a hopelessly abbreviated list of things that could go wrong:

   • The cache manifest returned an HTTP error 404 (Page Not Found) or 410 (Permanently Gone).
   • The cache manifest was found and hadn’t changed, but the HTML page that pointed to the manifest failed to download properly.
   • The cache manifest changed while the update was being run.
   • The cache manifest was found and had changed, but the browser failed to download one of the resources listed in the cache manifest.

THE FINE ART OF DEBUGGING, A.K.A. “KILL ME! KILL ME NOW!”

I want to call out two important points here. The first is something you just read, but I bet it didn’t really sink in, so here it is again: if even a single resource listed in your cache manifest file fails to download properly, the entire process of caching your offline web application will fail. Your browser will fire the error event, but there is no indication of what the actual problem was. This can make debugging offline web applications even more frustrating than usual.

The second important point is something that is not, technically speaking, an error, but it will look like a serious browser bug until you realize what’s going on. It has to do with exactly how your browser checks whether a cache manifest file has changed. This is a three-phase process. This is boring but important, so pay attention.

1. Via normal HTTP semantics, your browser will check whether the cache manifest has expired. Just like any other file being served over HTTP, your web server will typically include meta-information about the file in the HTTP response headers. Some of these HTTP headers (Expires and Cache-Control) tell your browser how it is allowed to cache the file without ever asking the server whether it has changed. This kind of caching has nothing to do with offline web applications. It happens for pretty much every HTML page, stylesheet, script, image, or other resource on the web.

2. If the cache manifest has expired (according to its HTTP headers), then your browser will ask the server whether there is a new version, and if so, the browser will download it. To do this, your browser issues an HTTP request that includes the last-modified date of the cache manifest, which your web server included in the HTTP response headers the last time your browser downloaded the manifest file. If the web server determines that the manifest file hasn’t changed since that date, it will simply return a 304 (Not Modified) status. Again, none of this is specific to offline web applications. This happens for essentially every kind of resource on the web.

3. If the web server thinks the manifest file has changed since that date, it will return an HTTP 200 (OK) status code, followed by the contents of the new file, along with new Cache-Control headers and a new last-modified date, so that steps 1 and 2 will work properly the next time. (HTTP is cool; web servers are always planning for the future. If your web server absolutely must send you a file, it does everything it can to ensure that it doesn’t need to send it twice for no reason.) Once it’s downloaded the new cache manifest file, your browser will check the contents against the copy it downloaded last time. If the contents of the cache manifest file are the same as they were last time, your browser won’t re-download any of the resources listed in the manifest.

Any one of these steps can trip you up while you’re developing and testing your offline web application. For example, say you deploy one version of your cache manifest file, then 10 minutes later, you realize you need to add another resource to it. No problem, right? Just add another line and redeploy. Bzzt. Here’s what will happen: you reload the page, your browser notices the manifest attribute, it fires the checking event, and then… nothing. Your browser stubbornly insists that the cache manifest file has not changed. Why? Because, by default, your web server is probably configured to tell browsers to cache static files for a few hours (via HTTP semantics, using Cache-Control headers). That means your browser will never get past step 1 of that three-phase process. Sure, the web server knows that the file has changed, but your browser never even gets around to asking the web server. Why?
Because the last time your browser downloaded the cache manifest, the web server told it to cache the resource for a few hours (via HTTP semantics, using Cache-Control headers). And now, 10 minutes later, that’s exactly what your browser is doing.

To be clear, this is not a bug, it’s a feature. Everything is working exactly the way it’s supposed to. If web servers didn’t have a way to tell browsers (and intermediate proxies) to cache things, the web would collapse overnight. But that’s no comfort to you after you spend a few hours trying to figure out why your browser won’t notice your updated cache manifest. (And even better, if you wait long enough, it will mysteriously start working again! Because the HTTP cache expired! Just like it’s supposed to! Kill me! Kill me now!)

So here’s one thing you should absolutely do: reconfigure your web server so that your cache manifest file is not cacheable by HTTP semantics. If you’re running an Apache-based web server, these two lines in your .htaccess file will do the trick:

    ExpiresActive On
    ExpiresDefault "access"

That will actually disable caching for every file in that directory and all subdirectories. That’s probably not what you want in production, so you should either qualify this with a <Files> directive so it only affects your cache manifest file, or create a subdirectory that contains nothing but this .htaccess file and your cache manifest file. As usual, configuration details vary by web server, so consult your server’s documentation for how to control HTTP caching headers.

Once you’ve disabled HTTP caching on the cache manifest file itself, you’ll still have times where you’ve changed one of the resources in the appcache, but it’s still at the same URL on your web server. Here, step 2 of the three-phase process will screw you. If your cache manifest file hasn’t changed, the browser will never notice that one of the previously cached resources has changed. Consider the following example:

    CACHE MANIFEST
    # rev 42
    clock.js
    clock.css

If you change clock.css and redeploy it, you won’t see the changes, because the cache manifest file itself hasn’t changed. Every time you make a change to one of the resources in your offline web application, you’ll need to change the cache manifest file itself. This can be as simple as changing a single character. The easiest way I’ve found to accomplish this is to include a comment line with a revision number. Change the revision number in the comment, then the web server will return the newly changed cache manifest file, your browser will notice that the contents of the file have changed, and it will kick off the process to re-download all the resources listed in the manifest.

    CACHE MANIFEST
    # rev 43
    clock.js

    clock.css

❧ LET’S BUILD ONE!

Remember the Halma game that we introduced in the canvas chapter and later improved by saving state with persistent local storage? Let’s take our Halma game offline.

To do that, we need a manifest that lists all the resources the game needs. Well, there’s the main HTML page, a single JavaScript file that contains all the game code, and… that’s it. There are no images, because all the drawing is done programmatically via the canvas API. All the necessary CSS styles are in a <style> element at the top of the HTML page. So this is our cache manifest:

    CACHE MANIFEST
    halma.html
    ../halma-localstorage.js

A word about paths. I’ve created an offline/ subdirectory in the examples/ directory, and this cache manifest file lives inside the subdirectory. Because the HTML page will need one minor addition to work offline (more on that in a minute), I’ve created a separate copy of the HTML file, which also lives in the offline/ subdirectory. But because there are no changes to the JavaScript code itself since we added local storage support, I’m literally reusing the same .js file, which lives in the parent directory (examples/). Altogether, the files look like this:

    /examples/localstorage-halma.html
    /examples/halma-localstorage.js
    /examples/offline/halma.manifest
    /examples/offline/halma.html

In the cache manifest file (/examples/offline/halma.manifest), we want to reference two files. First, the offline version of the HTML file (/examples/offline/halma.html).

Since the HTML file and the manifest file are in the same directory, the HTML file is listed in the manifest file without any path prefix. Second, the JavaScript file which lives in the parent directory (/examples/halma-localstorage.js). This is listed in the manifest file using relative URL notation: ../halma-localstorage.js. This is just like you might use a relative URL in an <img src> attribute. As you’ll see in the next example, you can also use absolute paths (that start at the root of the current domain) or even absolute URLs (that point to resources on other domains).

Now, in the HTML file, we need to add the manifest attribute that points to the cache manifest file.

    <!DOCTYPE html>
    <html lang="en" manifest="halma.manifest">

And that’s it! When an offline-capable browser first loads the offline-enabled HTML page, it will download the linked cache manifest file and start downloading all the referenced resources and storing them in the offline application cache. From then on, the offline application algorithm takes over whenever you revisit the page. You can play the game offline, and since it remembers its state locally, you can leave and come back as often as you like.

❧ FURTHER READING

Standards:
  Offline web applications in the HTML5 specification

Browser vendor documentation:
  Offline resources in Firefox
  HTML5 offline application cache, part of the Safari client-side storage and offline applications programming guide

Tutorials and demos:
  Gmail for mobile HTML5 series: using appcache to launch offline - part 1
  Gmail for mobile HTML5 series: using appcache to launch offline - part 2
  Gmail for mobile HTML5 series: using appcache to launch offline - part 3
  Debugging HTML5 offline application cache
  an HTML5 offline image editor and uploader application

❧

This has been “Let’s Take This Offline.” The full table of contents has more if you’d like to keep reading.


№9. A FORM OF MADNESS

❧ DIVING IN

Everybody knows about web forms, right? Make a <form>, a few <input type="text"> elements, maybe an <input type="password">, finish it off with an <input type="submit"> button, and you’re done.

You don’t know the half of it. HTML5 defines over a dozen new input types that you can use in your forms. And when I say “use,” I mean you can use them right now, without any shims, hacks, or workarounds. Now don’t get too excited; I don’t mean to say that all of these exciting new features are actually supported in every browser. Oh goodness no, I don’t mean that at all. In modern browsers, yes, your forms will kick all kinds of ass. But in legacy browsers, your forms will still work, albeit with less ass kicking. Which is to say, all of these features degrade gracefully in every browser. Even IE 6.

❧ PLACEHOLDER TEXT

PLACEHOLDER SUPPORT
IE    FIREFOX    SAFARI    CHROME    OPERA    IPHONE    ANDROID
·     3.7+       4.0+      4.0+      ·        4.0+      ·

The first improvement HTML5 brings to web forms is the ability to set placeholder text in an input field. Placeholder text is displayed inside the input field as long as the field is empty and not focused. As soon as you click on (or tab to) the input field, the placeholder text disappears.

You’ve probably seen placeholder text before. For example, Mozilla Firefox includes placeholder text in the location bar that reads “Search Bookmarks and History”. When you click on (or tab to) the location bar, the placeholder text disappears.

Here’s how you can include placeholder text in your own web forms:

    <form>
      <input name="q" placeholder="Search Bookmarks and History">
      <input type="submit" value="Search">
    </form>

Browsers that don’t support the placeholder attribute will simply ignore it. No harm, no foul. See whether your browser supports placeholder text.

ASK PROFESSOR MARKUP

☞ Q: Can I use HTML markup in the placeholder attribute? I want to insert an image, or maybe change the colors.
A: The placeholder attribute can only contain text, not HTML markup. However, there are some vendor-specific CSS extensions that allow you to style the placeholder text in some browsers.

❧ AUTOFOCUS FIELDS

AUTOFOCUS SUPPORT
IE    FIREFOX    SAFARI    CHROME    OPERA    IPHONE    ANDROID
·     ·          4.0+      3.0+      10.0+    ·         ·

Web sites can use JavaScript to focus the first input field of a web form automatically. For example, the home page of Google.com will autofocus the input box so you can type your search keywords. While this is convenient for most people, it can be annoying for power users or people with special needs. If you press the space bar expecting to scroll the page, the page will not scroll because the focus is already in a form input field. (It types a space in the field instead of scrolling.) If you focus a different input field while the page is still loading, the site’s autofocus script may “helpfully” move the focus back to the original input field, disrupting your flow and causing you to type in the wrong place.

Because the autofocusing is done with JavaScript, it can be tricky to handle all of these edge cases, and there is little recourse for people who don’t want a web page to “steal” the focus.

To solve this problem, HTML5 introduces an autofocus attribute on all web form controls. The autofocus attribute does exactly what it says on the tin: as soon as the page loads, it moves the input focus to a particular input field. But because it’s just markup instead of script, the behavior will be consistent across all web sites. Also, browser vendors (or extension authors) can offer users a way to disable the autofocusing behavior.

Here’s how you can set a form field to autofocus:

    <form>
      <input name="q" autofocus>
      <input type="submit" value="Search">
    </form>

Browsers that don’t support the autofocus attribute will simply ignore it. See whether your browser supports autofocus fields.

What’s that? You say you want your autofocus fields to work in all browsers, not just these fancy-pants HTML5 browsers? You can keep your current autofocus script. Just make two small changes:

1. Add the autofocus attribute to your HTML markup.
2. Detect whether the browser supports the autofocus attribute, and only run your own autofocus script if the browser doesn’t support autofocus natively.

Autofocus with fallback:

    <form name="f">
      <input id="q" autofocus>
      <script>
        if (!("autofocus" in document.createElement("input"))) {
          document.getElementById("q").focus();
        }
      </script>
      <input type="submit" value="Go">
    </form>

    …

See an example of autofocus with fallback.

SETTING FOCUS AS EARLY AS POSSIBLE

Lots of web pages wait until window.onload fires to set focus. But the window.onload event doesn’t fire until after all your images have loaded. If your page has a lot of images, such a naive script could potentially re-focus the field after the user has started interacting with another part of your page. This is why power users hate autofocus scripts.

The example in the previous section placed the auto-focus script immediately after the form field that it references. This is the optimal solution, but it may offend your sensibilities to put a block of JavaScript code in the middle of your page. (Or, more mundanely, your back-end systems may just not be that flexible.) If you can’t insert a script in the middle of your page, you should set focus during a custom event like jQuery’s $(document).ready() instead of window.onload.

Autofocus with jQuery fallback:

    <head>
    <script src=jquery.min.js></script>
    <script>
      $(document).ready(function() {
        if (!("autofocus" in document.createElement("input"))) {
          $("#q").focus();
        }
      });
    </script>
    </head>
    <body>
    <form name="f">
      <input id="q" autofocus>
      <input type="submit" value="Go">
    </form>

See an example of autofocus with jQuery fallback.

jQuery fires its custom ready event as soon as the page DOM is available. That is, it waits until the page text is loaded, but it doesn’t wait until all the images are loaded. This is not an optimal approach: if the page is unusually large or the network connection is unusually slow, a user could still start interacting with the page before your focus script executes. But it is still far better than waiting until the window.onload event fires.

If you are willing and able to insert a single script statement in your page markup, there is a middle ground that is less offensive than the first option and more optimal than the second. You can use jQuery’s custom events to define your own event, say autofocus_ready. Then you can trigger this event manually, immediately after the autofocus form field is available. Thanks to E. M. Sternberg for teaching me about this technique.

Autofocus with a custom event fallback:

    <head>
    <script src=jquery.min.js></script>
    <script>
      // The event name is a string, so it must be quoted.
      $(document).bind('autofocus_ready', function() {
        if (!("autofocus" in document.createElement("input"))) {
          $("#q").focus();
        }
      });
    </script>
    </head>
    <body>
    <form name="f">
      <input id="q" autofocus>
      <script>$(document).trigger('autofocus_ready');</script>
      <input type="submit" value="Go">
    </form>

See an example of autofocus with a custom event fallback.
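The custom event technique does not depend on jQuery in particular. As a rough, library-free sketch of the same bind-then-trigger pattern, the code below uses a tiny event registry. Both createEvents and installAutofocusFallback are hypothetical helpers invented for this illustration, not part of jQuery or any other library, and doc stands for the page’s document object.

```javascript
// A tiny event registry: bind() stores handlers, trigger() runs them.
function createEvents() {
  var handlers = {};
  return {
    // Register a handler to run whenever `name` is triggered.
    bind: function (name, handler) {
      if (!handlers[name]) {
        handlers[name] = [];
      }
      handlers[name].push(handler);
    },
    // Run every handler registered for `name`, in registration order.
    trigger: function (name) {
      (handlers[name] || []).forEach(function (handler) {
        handler();
      });
    }
  };
}

// Meant for the <head>: register the fallback once, before the form exists.
function installAutofocusFallback(events, doc, fieldId) {
  events.bind("autofocus_ready", function () {
    // Only focus by script when the browser lacks native autofocus.
    if (!("autofocus" in doc.createElement("input"))) {
      doc.getElementById(fieldId).focus();
    }
  });
}
```

In a page, the head script would create the registry and call installAutofocusFallback(events, document, "q"), and a one-line script right after the form field would call events.trigger("autofocus_ready"). The logic stays in the head, and the trigger in the body is a single statement.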

The custom-event approach is as optimal as the first approach; it will set focus to the form field as soon as technically possible, while the text of the page is still loading. But it transfers the bulk of your application logic (focusing the form field) out of the body of the page and into the head. This example relies on jQuery, but the concept of custom events is not unique to jQuery. Other JavaScript libraries like YUI and Dojo offer similar capabilities.

To sum up:

- Setting focus properly is important.
- If at all possible, let the browser do it by setting the autofocus attribute on the form field you want to have focus.
- If you code a fallback for older browsers, detect support for the autofocus attribute to make sure your fallback is only executed in older browsers.
- Set focus as early as possible. Insert the focus script into your markup immediately after the form field. If that offends you, use a JavaScript library that supports custom events, and trigger a custom event immediately after the form field markup. If that’s not possible, use something like jQuery’s $(document).ready() event.
- Under no circumstances should you wait until window.onload to set focus.

❧

EMAIL ADDRESSES

For over a decade, web forms comprised just a few kinds of fields. The most common kinds were:

    Field Type        HTML Code                  Notes
    checkbox          <input type="checkbox">    can be toggled on or off
    radio button      <input type="radio">       can be grouped with other inputs
    password field    <input type="password">    echoes dots instead of characters as you type
    drop-down lists   <select><option>…
    file picker       <input type="file">        pops up an “open file” dialog
    submit button     <input type="submit">
    plain text        <input type="text">        the type attribute can be omitted

All of these input types still work in HTML5. If you’re “upgrading to HTML5” (perhaps by changing your DOCTYPE), you don’t need to make a single change to your web forms. Hooray for backward compatibility!

However, HTML5 defines 13 new field types, and for reasons that will become clear in a moment, there is no reason not to start using them.

The first of these new input types is for email addresses. It looks like this:

    <form>
      <input type="email">
      <input type="submit" value="Go">
    </form>

I was about to write a sentence that started with “in browsers that don’t support type="email" …” but I stopped myself. Why? Because I’m not sure what it would mean to say that a browser doesn’t support type="email". All browsers “support” type="email". They may not do anything special with it (you’ll see a few examples of special treatment in a moment), but browsers that don’t recognize type="email" will treat it as type="text" and render it as a plain text field.

I cannot emphasize enough how important this is. The web has millions of forms that ask you to enter an email address, and all of them use <input type="text">. You see a text box, you type your email address in the text box, and that’s that. Along comes HTML5, which defines type="email". Do browsers freak out? No. Every single browser on Earth treats an unknown type attribute as type="text", even IE 6. So you can “upgrade” your web forms to use type="email" right now.

What would it mean to say that a browser DID support type="email"? Well, it can mean any number of things. The HTML5 specification doesn’t mandate any particular user interface for the new input types. Opera styles the form field with a small email icon. Other HTML5 browsers like Safari and Chrome simply render it as a text box, exactly like type="text", so your users will never know the difference (unless they view-source).

And then there’s the iPhone.

The iPhone does not have a physical keyboard. All “typing” is done by tapping on an on-screen keyboard that pops up at appropriate times, like when you focus a form field in a web page. Apple did something clever in the iPhone’s web browser. It recognizes several of the new HTML5 input types, and dynamically changes the on-screen keyboard to optimize for that kind of input.

For example, email addresses are text, right? Sure, but they’re a special kind of text. For example, virtually all email addresses contain the @ sign and at least one period (.), but they’re unlikely to contain any spaces. So when you use an iPhone and focus an <input type="email"> element, you get an on-screen keyboard that contains a smaller-than-usual space bar, plus dedicated keys for the @ and . characters.

Test type="email" for yourself.

To sum up: there’s no downside to converting all your email address form fields to type="email" immediately. Virtually no one will even notice, except iPhone users, who probably won’t notice either. But the ones who do notice will smile quietly and thank you for making their web experience just a little easier.

❧

WEB ADDRESSES

Web addresses (which standards wonks call URLs, except for a few pedants who call them URIs) are another type of specialized text. The syntax of a web address is constrained by the relevant Internet standards. If someone asks you to enter a web address into a form, they’re expecting something like “http://www.google.com/”, not “125 Farwood Road.” Forward slashes are common; even Google’s home page has three of them. Periods are also common, but spaces are forbidden. And every web address has a domain suffix like “.com” or “.org”.

Behold… (drum roll please)… <input type="url">. On the iPhone, it looks like this:

Test type="url" for yourself.

The iPhone altered its virtual keyboard, just like it did for email addresses, but now optimized for web addresses instead. The space bar has been completely replaced with three virtual keys: a period, a forward slash, and a “.com” button. (You can long-press the “.com” button to choose other common suffixes like “.org” or “.net”.)

Browsers that don’t support HTML5 will treat type="url" exactly like type="text", so there’s no downside to using it for all your web-address-inputting needs.
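Because unrecognized types fall back to type="text", that fallback can itself be used as a detection signal. The following is a minimal sketch of the same check this chapter applies later to date pickers, generalized to any type name. The function name supportsInputType and its doc parameter are assumptions made for this illustration; in a browser you would pass the global document object.

```javascript
// Report whether a browser recognizes a given <input> type.
// Unknown types fall back to "text", so reading the type back after
// setting it reveals whether the browser accepted it.
// `doc` is anything with a createElement method (normally `document`).
function supportsInputType(doc, typeName) {
  var input = doc.createElement("input");
  input.setAttribute("type", typeName);
  // "text" itself is always supported, so it needs a special case.
  return typeName === "text" || input.type !== "text";
}
```

Note that this only tells you whether the browser recognizes the type. As described above, a browser may recognize type="email" or type="url" and still render an ordinary text box.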

❧

NUMBERS AS SPINBOXES

Next up: numbers. Asking for a number is trickier than asking for an email address or web address. First of all, numbers are more complicated than you might think. Quick: pick a number. -1? No, I meant a number between 1 and 10. 7½? No no, not a fraction, silly. π? Now you’re just being irrational.

My point is, you don’t often ask for “just a number.” It’s more likely that you’ll ask for a number in a particular range. You may only want certain kinds of numbers within that range: maybe whole numbers but not fractions or decimals, or something more esoteric like numbers divisible by 10. HTML5 has you covered.

Pick a number, (almost) any number:

    <input type="number" min="0" max="10" step="2" value="6">

Let’s take that one attribute at a time. (You can follow along with a live example if you like.)

- type="number" means that this is a number field.
- min="0" specifies the minimum acceptable value for this field.
- max="10" is the maximum acceptable value.
- step="2", combined with the min value, defines the acceptable numbers in the range: 0, 2, 4, and so on, up to the max value.
- value="6" is the default value. This should look familiar. It’s the same attribute name you’ve always used to specify values for form fields. (I mention it here to drive home the point that HTML5 builds on previous versions of HTML. You don’t need to relearn how to do stuff you’re already doing.)
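To make the interaction of min, max, and step concrete, here is a small sketch that enumerates the acceptable values for a field like the one above. The function acceptableValues is a hypothetical helper written for this illustration, not a browser API; it simply counts steps upward from min.

```javascript
// List the values a number field accepts, given its min, max and step.
// Hypothetical helper for illustration; browsers do this internally.
function acceptableValues(min, max, step) {
  if (!(step > 0)) {
    throw new RangeError("step must be a positive number");
  }
  var values = [];
  // Multiply by an integer index instead of adding step repeatedly,
  // which avoids accumulating floating point error.
  for (var i = 0; min + i * step <= max; i += 1) {
    values.push(min + i * step);
  }
  return values;
}

console.log(acceptableValues(0, 10, 2)); // [ 0, 2, 4, 6, 8, 10 ]
```

Because steps are counted from min, changing min shifts the whole series: with min="1", max="10", and step="4", the acceptable values would be 1, 5, and 9.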

That’s the markup side of a number field. Keep in mind that all of those attributes are optional. If you have a minimum but no maximum, you can specify a min attribute but no max attribute. The default step value is 1, and you can omit the step attribute unless you need a different step value. If there’s no default value, then the value attribute can be the empty string or even omitted altogether.

But HTML5 doesn’t stop there. For the same low, low price of free, you get these handy JavaScript methods and properties as well:

- input.stepUp(n) increases the field’s value by n.
- input.stepDown(n) decreases the field’s value by n.
- input.valueAsNumber returns the current value as a floating point number. (The input.value property is always a string.)

Having trouble visualizing it? Well, the exact interface of a number control is up to your browser, and different browser vendors have implemented support in different ways. On the iPhone, where input is difficult to begin with, the browser once again optimizes the virtual keyboard for numeric input.

In the desktop version of Opera, the same type="number" field is rendered as a “spinbox” control, with little up and down arrows that you can click to change the value.

Opera respects the min, max, and step attributes, so you’ll always end up with an acceptable numeric value. If you bump up the value to the maximum, the up arrow in the spinbox is greyed out.

As with all the other input types I’ve discussed in this chapter, browsers that don’t support type="number" will treat it as type="text". The default value will show up in the field (since it’s stored in the value attribute), but the other attributes like min and max will be ignored. You’re free to implement them yourself, or you could reuse a JavaScript framework that has already implemented spinbox controls. Just check for the native HTML5 support first, like this:

    if (!Modernizr.inputtypes.number) {
      // no native support for type=number fields
      // maybe try Dojo or some other JavaScript framework
    }

❧

NUMBERS AS SLIDERS

Spinboxes are not the only way to represent numeric input. You’ve probably also seen “slider” controls that look like this:

Test type="range" for yourself.

You can now have slider controls in your web forms, too. The markup looks eerily similar to spinbox controls:

The spitting image:

    <input type="range" min="0" max="10" step="2" value="6">

All the available attributes are the same as type="number" (min, max, step, value) and they mean the same thing. The only difference is the user interface. Instead of a field for typing, browsers are expected to render type="range" as a slider control. At time of writing, the latest versions of Safari, Chrome, and Opera all do this. (Sadly, the iPhone renders it as a simple text box. It doesn’t even optimize its on-screen keyboard for numeric input.) All other browsers simply treat the field as type="text", so there’s no reason you can’t start using it immediately.

❧

DATE PICKERS

HTML 4 did not include a date picker control. JavaScript frameworks have picked up the slack (Dojo, jQuery UI, YUI, Closure Library), but of course each of these solutions requires “buying into” the framework on which the date picker is built.

HTML5 finally defines a way to include a native date picker control without having to script it yourself. In fact, it defines six: date, month, week, time, date + time, and date + time - timezone.

So far, support is… sparse.

DATE PICKER SUPPORT

    INPUT TYPE               OPERA   EVERY OTHER BROWSER
    type="date"              9.0+    ·
    type="month"             9.0+    ·
    type="week"              9.0+    ·
    type="time"              9.0+    ·
    type="datetime"          9.0+    ·
    type="datetime-local"    9.0+    ·

This is how Opera renders <input type="date">:

If you need a time to go with that date, Opera also supports <input type="datetime">:

If you only need a month + year (perhaps a credit card expiration date), Opera can render a <input type="month">:

Less common, but also available, is the ability to pick a specific week of a year with <input type="week">:

Last but not least, you can pick a time with <input type="time">:

It’s likely that other browsers will eventually support these input types. But just like type="email" and the other input types, these form fields will be rendered as plain text boxes in browsers that don’t recognize type="date" and the other variants. If you like, you can simply use <input type="date"> and friends, make Opera users happy, and wait for other browsers to catch up. More realistically, you can use <input type="date">, detect whether the browser has native support for date pickers, and fall back to a scripted solution of your choice (Dojo, jQuery UI, YUI, Closure Library, or some other solution).

Date picker with fallback:

    <form>
      <input type="date">
    </form>
    ...
    <script>
      var i = document.createElement("input");
      i.setAttribute("type", "date");
      if (i.type == "text") {
        // No native date picker support :(
        // Use Dojo/jQueryUI/YUI/Closure to create one,
        // then dynamically replace that <input> element.
      }
    </script>

❧

SEARCH BOXES

OK, this one is subtle. Well, the idea is simple enough, but the implementations may require some explanation. Here goes…

Search. Not just Google Search or Yahoo Search. (Well, those too.) Think of any search box, on any page, on any site. Amazon has a search box. Newegg has a search box. Most blogs have a search box. How are they marked up? <input type="text">, just like every other text box on the web. Let’s fix that.

New-age search box:

    <form>
      <input name="q" type="search">
      <input type="submit" value="Find">
    </form>

Test <input type="search"> in your own browser. In some browsers, you won’t notice any difference from a regular text box. But if you’re using Safari on Mac OS X, it will look like this:

Can you spot the difference? The input box has rounded corners! I know, I know, you can hardly contain your excitement. But wait, there’s more! When you actually start typing into the type="search" box, Safari inserts a small “x” button on the right side of the box. Clicking the “x” clears the contents of the field. (Google Chrome, which shares much technology with Safari under the hood, also exhibits this behavior.) Both of these small tweaks are done to match the look and feel of native search boxes in iTunes and other Mac OS X client applications.

Apple.com uses <input type="search"> for their site-search box, to help give their site a “Mac-like” feel. But there’s nothing Mac-specific about it. It’s just markup, so each browser on each platform can choose to render it according to platform-specific conventions. As with all the other new input types, browsers that don’t recognize type="search" will treat it like type="text", so there is absolutely no reason not to start using type="search" for all your search boxes today.

PROFESSOR MARKUP SAYS

By default, Safari will not apply even the most basic CSS styles to <input type="search"> fields. If you want to force Safari to treat your search field like a normal text field (so you can apply your own CSS styles), add this rule to your stylesheet:

    input[type="search"] {
      -webkit-appearance: textfield;
    }

Thanks to John Lein for teaching me this trick.

❧

COLOR PICKERS

HTML5 also defines <input type="color">, which lets you pick a color and returns the color’s hexadecimal representation. No browser supports it yet, which is a shame, because I’ve always loved the Mac OS color picker. Maybe someday.

❧

FORM VALIDATION

FORM VALIDATION SUPPORT

    IE   FIREFOX   SAFARI   CHROME   OPERA   IPHONE   ANDROID
    ·    4.0+      5.0+     6.0+     9.0+    ·        ·

In this chapter, I’ve talked about new input types and new features like auto-focus form fields, but I haven’t mentioned what is perhaps the most exciting part of HTML5 forms: automatic input validation. Consider the common problem of entering an email address into a web form. You probably have some client-side validation in JavaScript, followed by server-side validation in PHP or Python or some other server-side scripting language. HTML5 can never replace your server-side validation, but it might someday replace your client-side validation.

There are two big problems with validating email addresses in JavaScript:

1. A surprising number of your visitors (probably around 10%) won’t have JavaScript enabled
2. You’ll get it wrong

Seriously, you’ll get it wrong. Determining whether a random string of characters is a valid email address is unbelievably complicated. The harder you look, the more complicated it gets. Did I mention it’s really, really complicated? Wouldn’t it be easier to offload the entire headache to your browser?

Opera validates type="email" (screenshot)

That screenshot is from Opera 10, although the functionality has been present since Opera 9. The only markup involved is setting the type attribute to "email". When an Opera user tries to submit a form with an <input type="email"> field, Opera automatically offers RFC-compliant email validation, even if scripting is disabled.

HTML5 also offers validation of web addresses entered into <input type="url"> fields, and numbers in <input type="number"> fields. The validation of numbers even takes into account the min and max attributes, so browsers will not let you submit the form if you enter a number that is too large.
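Since browser validation never replaces checking on the server, it can help to see how small the equivalent range check is. The following is a rough sketch, assuming the submitted value arrives as a string, of re-checking a number against min, max, and step. The function name checkNumberField and the result labels are invented for this illustration and do not correspond to any browser or library API. Also note that Number() is more lenient than a browser would be (it accepts "0x10", for example), so treat this as a starting point only.

```javascript
// Re-check a submitted number against min, max and step.
// Hypothetical helper; returns a short label describing the outcome.
function checkNumberField(rawValue, min, max, step) {
  var text = String(rawValue).trim();
  var value = Number(text);
  // Number("") is 0, so an empty string must be rejected explicitly.
  if (text === "" || !Number.isFinite(value)) {
    return "not-a-number";
  }
  if (value < min) {
    return "too-small";
  }
  if (value > max) {
    return "too-large";
  }
  // Steps are counted from min; allow a tiny tolerance for
  // floating point rounding.
  var steps = (value - min) / step;
  if (Math.abs(steps - Math.round(steps)) > 1e-9) {
    return "off-step";
  }
  return "ok";
}
```

For a field declared with min="0" max="10" step="2", the value "6" passes, "12" is too large, and "7" is in range but falls between steps.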

There is no markup required to activate HTML5 form validation; it is on by default. To turn it off, use the novalidate attribute.

Don’t validate me:

    <form novalidate>
      <input type="email" id="addr">
      <input type="submit" value="Subscribe">
    </form>

Browsers are slowly implementing support for HTML5 form validation. Firefox 4 will have complete support. Unfortunately, Safari and Chrome have shipped an incomplete implementation that may trip you up: they validate form controls, but they don’t offer any visible feedback when a form field fails validation. In other words, if you enter an invalid (or improperly formatted) date in a type="date" field, Safari and Chrome will not submit the form, but they won’t tell you why they didn’t submit the form. (They will set focus to the field that contains the invalid value, but they don’t display an error message like Opera or Firefox 4.)

❧

REQUIRED FIELDS

<INPUT REQUIRED> SUPPORT

    IE   FIREFOX   SAFARI   CHROME   OPERA   IPHONE   ANDROID
    ·    4.0+      ·        ·        9.0+    ·        ·

HTML5 form validation isn’t limited to the type of each field. You can also specify that certain fields are required. Required fields must have a value before you can submit the form.

The markup for required fields is as simple as can be:

    <form>
      <input id="q" required>
      <input type="submit" value="Search">
    </form>

Test <input required> in your own browser. Browsers may alter the default appearance of required fields. For example, this is what a required field looks like in Mozilla Firefox 4.0:

Furthermore, if you attempt to submit the form without filling in the required value, Firefox will pop up an infobar telling you that the field is mandatory and cannot be left blank.

❧

FURTHER READING

Specifications and standards:

- <input> types
- the <input placeholder> attribute
- the <input autofocus> attribute
- the <form novalidate> attribute
- the <input required> attribute
- HTML5 Forms in Mozilla Firefox 4.0+

JavaScript libraries:

- Modernizr, an HTML5 detection library

Useful articles:

- Forward Thinking Form Validation

❧

This has been “A Form of Madness.” The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?

In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition. If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a buck. I do not currently accept direct donations.

Copyright MMIX–MMX Mark Pilgrim

№10. “DISTRIBUTED,” “EXTENSIBILITY,” & OTHER FANCY WORDS

❧

DIVING IN

There are over 100 elements in HTML5. Some are purely semantic, others are just containers for scripted APIs. Throughout the history of HTML, standards wonks have argued about which elements should be included in the language. Should HTML include a <figure> element? A <person> element? How about a <rant> element? Decisions are made, specs are written, authors author, implementors implement, and the web lurches ever forward.

Of course, HTML can’t please everyone. No standard can. Some ideas don’t make the cut. For example, there is no <person> element in HTML5. (There’s no <rant> element either, damn it!) There’s nothing stopping you from including a <person> element in a web page, but it won’t validate, it won’t work consistently across browsers, and it might conflict with future HTML specs if we want to add it later.

Right, so if making up your own elements isn’t the answer, what’s a semantically inclined web author to do? There have been attempts to extend previous versions of HTML. The most popular method is microformats, which uses the class and rel attributes in HTML 4. Another option is RDFa, which was originally designed to be used in XHTML but is now being ported to HTML as well.

Microformats and RDFa each have their strengths and weaknesses. They take radically different approaches towards the same goal: extending web pages with additional semantics that are not part of the core HTML language. I don’t intend to turn this chapter into a format flamewar. (That would definitely require a <rant> element!) Instead, I want to focus on a third option which is part of, and tightly integrated into, HTML5 itself: microdata.

❧

WHAT IS MICRODATA?

Each word in the following sentence is important, so pay attention.

PROFESSOR MARKUP SAYS

Microdata annotates the DOM with scoped name/value pairs from custom vocabularies.

Now what does that mean? Let’s start from the end and work backwards. Microdata centers around custom vocabularies. Think of “the set of all HTML5 elements” as one vocabulary. This vocabulary includes elements to represent a section or an article, but it doesn’t include elements to represent a person or an event. If you want to represent a person on a web page, you’ll need to define your own vocabulary. Microdata lets you do this. Anyone can define a microdata vocabulary and start embedding custom properties in their own web pages.

The next thing to know about microdata is that it works with name/value pairs. Every microdata vocabulary defines a set of named properties. For example, a Person vocabulary could define properties like name and photo. To include a specific microdata property on your web page, you provide the property name in a specific place. Depending on where you declare the property name, microdata has rules about how to extract the property value. (More on this in the next section.)

Along with named properties, microdata relies heavily on the concept of “scoping.” The simplest way to think of microdata scoping is to think about the natural parent-child relationship of elements in the DOM. The <html> element usually contains two children, <head> and <body>. The <body> element usually contains multiple children, each of which may have child elements of their own. For example, your page might include an <h1> element within an <hgroup> element within a <header> element within the <body> element. A data table might contain <td> within <tr> within <table> (within <body>). Microdata re-uses the hierarchical structure of the DOM itself to provide a way to say “all the properties within this element are taken from this vocabulary.” This allows you to use more than one microdata vocabulary on the same page. You can even nest microdata vocabularies within other vocabularies, all by re-using the natural structure of the DOM. (I’ll show multiple examples of nested vocabularies throughout this chapter.)

Now, I’ve already touched on the DOM, but let me elaborate on that. Microdata is about applying additional semantics to data that’s already visible on your web page. Microdata is not designed to be a standalone data format. It’s a complement to HTML. As you’ll see in the next section, microdata works best when you’re already using HTML correctly, but the HTML vocabulary isn’t quite expressive enough. Microdata is great for fine-tuning the semantics of data that’s already in the DOM. If the data you’re semanti-fying isn’t in the DOM, you should step back and re-evaluate whether microdata is the right solution.

Does this sentence make more sense now? “Microdata annotates the DOM with scoped name/value pairs from custom vocabularies.” I hope so. Let’s see it in action.

❧

THE MICRODATA DATA MODEL

Defining your own microdata vocabulary is easy. First, you need a namespace, which is just a URL. The namespace URL could actually point to a working web page, although that’s not strictly required. Let’s say I want to create a microdata vocabulary that describes a person. If I own the data-vocabulary.org domain, I’ll use the URL http://data-vocabulary.org/Person as the namespace for my microdata vocabulary. That’s an easy way to create a globally unique identifier: pick a URL on a domain that you control.

In this vocabulary, I need to define some named properties. Let’s start with three basic properties:

- name (your full name)
- photo (a link to a picture of you)
- url (a link to a site associated with you, like a weblog or a Google profile)

Some of these properties are URLs, others are plain text. Each of them lends itself to a natural form of markup, even before you start thinking about microdata or vocabularies or whatnot. Imagine that you have a profile page or an “about” page. Your name is probably marked up as a heading, like an <h1> element. Your photo is probably an <img> element, since you want people to see it. And any URLs associated with your profile are probably already marked up as hyperlinks, because you want people to be able to click them. For the sake of discussion, let’s say your entire profile is also wrapped in a <section> element to separate it from the rest of the page content. Thus:

It’s all about me:

    <section>
      <h1>Mark Pilgrim</h1>
      <p><img src="http://www.example.com/photo.jpg" alt="[me smiling]"></p>
      <p><a href="http://diveintomark.org/">weblog</a></p>
    </section>

Microdata’s data model is name/value pairs. A microdata property name (like name or photo or url in this example) is always declared on an HTML element. The corresponding property value is then taken from the element’s DOM. For most HTML elements, the property value is simply the text content of the element. But there are a handful of exceptions.

WHERE DO MICRODATA PROPERTY VALUES COME FROM?

    Element                                                  Value
    <meta>                                                   content attribute
    <audio>, <embed>, <iframe>, <img>, <source>, <video>     src attribute
    <a>, <area>, <link>                                      href attribute
    <object>                                                 data attribute
    <time>                                                   datetime attribute
    all other elements                                       text content

“Adding microdata” to your page is a matter of adding a few attributes to the HTML elements you already have. The first thing you always do is declare which microdata vocabulary you’re using, by adding an itemtype attribute. The second thing you always do is declare the scope of the vocabulary, using an itemscope attribute. In this example, all the data we want to semanti-fy is in a <section> element, so we’ll declare the itemtype and itemscope attributes on the <section> element.

    <section itemscope itemtype="http://data-vocabulary.org/Person">

Your name is the first bit of data within the <section> element. It’s wrapped in an <h1> element. The <h1> element doesn’t have any special processing in the HTML5 microdata data model, so it falls under the “all other elements” rule where the microdata property value is simply the text content of an element. (This would work equally well if your name was wrapped in a <p>, <div>, or <span> element.)

    <h1 itemprop="name">Mark Pilgrim</h1>

In English, this says “here is the name property of the http://data-vocabulary.org/Person vocabulary, and the value of the property is Mark Pilgrim.”

Next up: the photo property. This is supposed to be a URL. According to the HTML5 microdata data model, the “value” of an <img> element is its src attribute. Hey look, the URL of your profile photo is already in an <img src> attribute. All you need to do is declare that the <img> element is the photo property.

    <p><img itemprop="photo"
            src="http://www.example.com/photo.jpg"
            alt="[me smiling]"></p>

In English, this says “here is the photo property of the http://data-vocabulary.org/Person vocabulary, and the value of the property is http://www.example.com/photo.jpg.”

Finally, the url property is also a URL. According to the HTML5 microdata data model, the “value” of an <a> element is its href attribute. And once again, this fits perfectly with your existing markup. All you need to do is say that your existing <a> element is the url property:

    <a itemprop="url" href="http://diveintomark.org/">dive into mark</a>

In English, this says “here is the url property of the http://data-vocabulary.org/Person vocabulary, and the value of the property is http://diveintomark.org/.”

Of course, if your markup looks a little different, that’s not a problem. You can add microdata properties and values to any HTML markup, even really gnarly 20th-century-era, tables-for-layout, Oh-God-why-did-I-agree-to-maintain-this markup. While I don’t recommend this kind
of markup, it is still common, and you can still add microdata to it.

For the love of God, don’t do this:

    <TABLE>
      <TR><TD>Name<TD>Mark Pilgrim
      <TR><TD>Link<TD>
        <A href=# onclick=goExternalLink()>http://diveintomark.org/</A>
    </TABLE>

For marking up the name property, just add an itemprop attribute on the table cell that contains the name. Table cells have no special rules in the microdata property value table, so they get the default value, “the microdata property is the text content.”

    <TR><TD>Name<TD itemprop="name">Mark Pilgrim

Adding the url property looks trickier. This markup doesn’t use the <a> element properly. Instead of putting the link target in the href attribute, it has nothing useful in the href attribute and uses JavaScript in the onclick attribute to call a function (not shown) that extracts the URL and navigates to it. For extra “holy fuck, please stop doing that” bonus points, let’s pretend that the function also opens the link in a tiny popup window with no scroll bars. Wasn’t the internet fun last century?

Anyway, you can still convert this into a microdata property, you just need to be a little creative. Using the <a> element directly is out of the question. The link target isn’t in the href attribute, and there’s no way to override the rule that says “in an <a> element, look for the microdata property value in the href attribute.” But you can add a wrapper element around the entire mess, and use that to add the url microdata property.

This is what you get for subverting HTML:

    <TABLE itemscope itemtype="http://data-vocabulary.org/Person">
      <TR><TD>Name<TD>Mark Pilgrim
      <TR><TD>Link<TD>
        <span itemprop="url">
          <A href=# onclick=goExternalLink()>http://diveintomark.org/</A>
        </span>
    </TABLE>

Since the <span> element has no special processing, it uses the default rule, “the microdata property is the text content.” “Text content” doesn’t mean “all the markup inside this element” (like you would get with, say, the innerHTML DOM property). It means “just the text, ma’am.” In this case, http://diveintomark.org/, the text content of the <a> element inside the <span> element.

To sum up: you can add microdata properties to any markup. If you’re using HTML correctly, you’ll find it easier to add microdata than if your HTML markup sucks, but it can always be done.

❧

MARKING UP PEOPLE

By the way, the starter examples in the previous section weren’t completely made up. There really is a microdata vocabulary for marking up information about people, and it really is that easy. Let’s take a closer look.

The easiest way to integrate microdata into a personal website is on your “about” page. You do have an “about” page, don’t you? If not, you can follow along as I extend this sample “about” page with additional semantics. The final result is here: person-plus-microdata.html.

Let’s look at the raw markup first, before any microdata properties have been added:

    <section>
      <img width="204" height="250"
           src="http://diveintohtml5.org/examples/2000_05_mark.jpg"
           alt="[Mark Pilgrim, circa 2000]">
    <h1>Contact Information</h1>
    <dl>
    <dt>Name</dt>
    <dd>Mark Pilgrim</dd>
    <dt>Position</dt>
    <dd>Developer advocate for Google, Inc.</dd>
    <dt>Mailing address</dt>
    <dd>100 Main Street<br>Anytown, PA 19999<br>USA</dd>
    </dl>
    <h1>My Digital Footprints</h1>
    <ul>
    <li><a href="http://diveintomark.org/">weblog</a></li>
    <li><a href="http://www.google.com/profiles/pilgrim">Google profile</a></li>
    <li><a href="http://www.reddit.com/user/MarkPilgrim">Reddit.com profile</a></li>
    <li><a href="http://www.twitter.com/diveintomark">Twitter</a></li>
    </ul>
    </section>

The first thing you always need to do is declare the vocabulary you’re using, and the scope of the properties you want to add. You do this by adding the itemtype and itemscope attributes on the outermost element that contains the other elements that contain the actual data. In this case, that’s a <section> element.

    <section itemscope itemtype="http://data-vocabulary.org/Person">

[Follow along! Before: person.html, after: person-plus-microdata.html]

Now you can start defining microdata properties from the http://data-vocabulary.org/Person vocabulary.

212.
But what are those properties? As it happens, you can see the list of properties by navigating to data-vocabulary.org/Person in your browser. The microdata specification does not require this, but I’d say it’s certainly a “best practice.” After all, if you want developers to actually use your microdata vocabulary, you need to document it. And where better to put your documentation than the vocabulary URL itself?

PERSON VOCABULARY

Property      Description
name          Name
nickname      Nickname
photo         An image link
title         The person’s title (for example, “Financial Manager”)
role          The person’s role (for example, “Accountant”)
url           Link to a web page, such as the person’s home page
affiliation   The name of an organization with which the person is associated (for example, an employer)
friend        Identifies a social relationship between the person described and another person
contact       Identifies a social relationship between the person described and another person
acquaintance  Identifies a social relationship between the person described and another person
address       The location of the person. Can have the subproperties street-address, locality, region, postal-code, and country-name.

The first thing in this sample “about” page is a picture of me. Naturally, it’s marked up with an <img> element. To declare that this <img> element is my profile picture, all we need to do is add itemprop="photo" to the <img> element.

    <img itemprop="photo" width="204" height="250"
    src="http://diveintohtml5.org/examples/2000_05_mark.jpg"
    alt="[Mark Pilgrim, circa 2000]">

[Follow along! Before: person.html, after: person-plus-microdata.html]

Where’s the microdata property value? It’s already there, in the src attribute. If you recall

213.
from the HTML5 microdata data model, the “value” of an <img> element is its src attribute. Every <img> element has a src attribute (otherwise it would just be a broken image), and the src is always a URL. See? If you’re using HTML correctly, microdata is easy.

Furthermore, this <img> element isn’t alone on the page. It’s a child element of the <section> element, the one we just declared with the itemscope attribute. Microdata reuses the parent-child relationship of elements on the page to define the scoping of microdata properties. In plain English, we’re saying, “This <section> element represents a person. Any microdata properties you might find on the children of the <section> element are properties of that person.” If it helps, you can think of the <section> element as the subject of a sentence. The itemprop attribute represents the verb of the sentence, something like “is pictured at.” The microdata property value represents the object of the sentence.

    This person [explicit, from <section itemscope itemtype="...">]
    is pictured at [explicit, from <img itemprop="photo">]
    http://diveintohtml5.org/examples/2000_05_mark.jpg [implicit, from <img src> attribute]

The subject only needs to be defined once, by putting itemscope and itemtype attributes on the outermost <section> element. The verb is defined by putting the itemprop="photo" attribute on the <img> element. The object of the sentence doesn’t need any special markup at all, because the HTML5 microdata data model says that the property value of an <img> element is its src attribute.

Moving on to the next bit of markup, we see an <h1> header and the beginnings of a <dl> list. Neither the <h1> nor the <dl> needs to be marked up with microdata. Not every piece of HTML needs to be a microdata property. Microdata is about the properties themselves, not the markup or headers surrounding the properties. This <h1> isn’t a property; it’s just a header. Similarly, the <dt> that says “Name” isn’t a property; it’s just a label.
    <h1>Contact Information</h1>   ← Boring
    <dl>                           ← Boring

214.
    <dt>Name</dt>                  ← Boring
    <dd>Mark Pilgrim</dd>

So where is the real information? It’s in the <dd> element, so that’s where we need to put the itemprop attribute. Which property is it? It’s the name property. Where is the property value? It’s the text within the <dd> element. Does that need to be marked up? The HTML5 microdata data model says no, <dd> elements have no special processing, so the property value is just the text within the element.

↶ That’s my name, don’t wear it out
    <dd itemprop="name">Mark Pilgrim</dd>

[Follow along! Before: person.html, after: person-plus-microdata.html]

What did we just say, in English? “This person’s name is Mark Pilgrim.” Well OK then. Onward.

The next two properties are a little tricky. This is the markup, pre-microdata:

    <dt>Position</dt>
    <dd>Developer advocate for Google, Inc.</dd>

If you look at the definition of the Person vocabulary, the text “Developer advocate for Google, Inc.” actually encompasses two properties: title (“Developer advocate”) and affiliation (“Google, Inc.”). How can you express that in microdata? The short answer is, you can’t. Microdata doesn’t have a way to break up runs of text into separate properties. You can’t say “the first 18 characters of this text are one microdata property, and the last 12 characters of this text are another microdata property.”

But all is not lost. Imagine that you wanted to style the text “Developer advocate” in a different font from the text “Google, Inc.” CSS can’t do that either. So what would you do? You would first need to wrap the different bits of text in dummy elements, like <span>, then apply different CSS rules to each <span> element.

This technique is also useful for microdata. There are two distinct pieces of information here:

215.
a title and an affiliation. If you wrap each piece in a dummy <span> element, you can declare that each <span> is a separate microdata property.

    <dt>Position</dt>
    <dd><span itemprop="title">Developer advocate</span> for
    <span itemprop="affiliation">Google, Inc.</span></dd>

[Follow along! Before: person.html, after: person-plus-microdata.html]

Tada! “This person’s title is Developer advocate. This person is employed by Google, Inc.” Two sentences, two microdata properties. A little more markup, but a worthwhile tradeoff.

The same technique is useful for marking up street addresses. The Person vocabulary defines an address property, which itself is a microdata item. That means the address has its own vocabulary (http://data-vocabulary.org/Address) and defines its own properties. The Address vocabulary defines 5 properties: street-address, locality, region, postal-code, and country-name.

If you’re a programmer, you are probably familiar with dot notation to define objects and their properties. Think of the relationship like this:

    Person
    Person.address
    Person.address.street-address
    Person.address.locality
    Person.address.region
    Person.address.postal-code
    Person.address.country-name

In this example, the entire street address is contained in a single <dd> element. (Once again, the <dt> element is just a label, so it plays no role in adding semantics with microdata.) Notating the address property is easy. Just add an itemprop attribute on the <dd> element.

    <dt>Mailing address</dt>
    <dd itemprop="address">

216.
[Follow along! Before: person.html, after: person-plus-microdata.html]

But remember, the address property is itself a microdata item. That means we need to add the itemscope and itemtype attributes too.

    <dt>Mailing address</dt>
    <dd itemprop="address" itemscope
    itemtype="http://data-vocabulary.org/Address">

[Follow along! Before: person.html, after: person-plus-microdata.html]

We’ve seen all of this before, but only for top-level items. A <section> element defines itemtype and itemscope, and all the elements within the <section> element that define microdata properties are “scoped” within that specific vocabulary. But this is the first time we’ve seen nested scopes: defining a new itemtype and itemscope (on the <dd> element) within an existing one (on the <section> element). This nested scope works exactly like the HTML DOM. The <dd> element has a certain number of child elements, all of which are scoped to the vocabulary defined on the <dd> element. Once the <dd> element is closed with a corresponding </dd> tag, the scope reverts to the vocabulary defined by the parent element (<section>, in this case).

The properties of the Address suffer the same problem we encountered with the title and affiliation properties. There’s just one long run of text, but we want to break it up into five separate microdata properties. The solution is the same: wrap each distinct piece of information in a dummy <span> element, then declare microdata properties on each <span> element.

    <dd itemprop="address" itemscope
    itemtype="http://data-vocabulary.org/Address">
    <span itemprop="street-address">100 Main Street</span><br>
    <span itemprop="locality">Anytown</span>,
    <span itemprop="region">PA</span>
    <span itemprop="postal-code">19999</span>
    <span itemprop="country-name">USA</span>
    </dd>
    </dl>

217.
[Follow along! Before: person.html, after: person-plus-microdata.html]

In English: “This person has a mailing address. The street address part of the mailing address is 100 Main Street. The locality part is Anytown. The region is PA. The postal code is 19999. The country name is USA.” Easy peasy.

ASK PROFESSOR MARKUP

☞ Q: Is this mailing address format US-specific?
A: No. The properties of the Address vocabulary are generic enough that they can describe most mailing addresses in the world. Not all addresses will have values for every property, but that’s OK. Some addresses might require fitting more than one “line” into a single property, but that’s OK too. For example, if your mailing address has a street address and a suite number, they would both go into the street-address subproperty:

    <p itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
    <span itemprop="street-address">
    100 Main Street
    Suite 415
    </span>
    ...
    </p>

There’s one more thing on this sample “about” page: a list of URLs. The Person vocabulary has a property for this, called url. A url property can be anything, really. (Well, it has to be

218.
a URL, but you probably guessed that.) What I mean is that the url property is loosely defined. The property can be any sort of URL that you want to associate with a Person: a weblog, a photo gallery, or a profile on another site like Facebook or Twitter.

The other important thing to note here is that a single Person can have multiple url properties. Technically, any property can appear more than once, but until now, we haven’t taken advantage of that. For example, you could have two photo properties, each pointing to a different image URL. Here, I want to list four different URLs: my weblog, my Google profile page, my user profile on Reddit, and my Twitter account. In HTML, that’s a list of links: four <a> elements, each in its own <li> element. In microdata, each <a> element gets an itemprop="url" attribute.

    <h1>My Digital Footprints</h1>
    <ul>
    <li><a href="http://diveintomark.org/"
    itemprop="url">weblog</a></li>
    <li><a href="http://www.google.com/profiles/pilgrim"
    itemprop="url">Google profile</a></li>
    <li><a href="http://www.reddit.com/user/MarkPilgrim"
    itemprop="url">Reddit.com profile</a></li>
    <li><a href="http://www.twitter.com/diveintomark"
    itemprop="url">Twitter</a></li>
    </ul>

According to the HTML5 microdata data model, <a> elements have special processing. The microdata property value is the href attribute, not the child text content. The text of each link is actually ignored by a microdata processor. Thus, in English, this says “This person has a URL at http://diveintomark.org/. This person has another URL at http://www.google.com/profiles/pilgrim. This person has another URL at http://www.reddit.com/user/MarkPilgrim. This person has another URL at http://www.twitter.com/diveintomark.”

INTRODUCING GOOGLE RICH SNIPPETS

I want to step back for just a moment and ask, “Why are we doing this?” Are we adding

219.
semantics just for the sake of adding semantics? Don’t get me wrong; I enjoy fiddling with angle brackets as much as the next webhead. But why microdata? Why bother?

There are two major classes of applications that consume HTML, and by extension, HTML5 microdata:

1. Web browsers
2. Search engines

For browsers, HTML5 defines a set of DOM APIs for extracting microdata items, properties, and property values from a web page. As I write this, no browser supports this API. Not a single one. So that’s… kind of a dead end, at least until browsers catch up and implement the client-side APIs.

The other major consumer of HTML is search engines. What could a search engine do with microdata properties about a person? Imagine this: instead of simply displaying the page title and an excerpt of text, the search engine could integrate some of that structured information and display it. Full name, job title, employer, address, maybe even a little thumbnail of a profile photo. Would that catch your attention? It would catch mine.

Google supports microdata as part of their Rich Snippets program. When Google’s web crawler parses your page and finds microdata properties that conform to the http://data-vocabulary.org/Person vocabulary, it parses out those properties and stores them alongside the rest of the page data. Google even provides a handy tool to see how Google “sees” your microdata properties. Testing it against our sample microdata-enabled “about” page yields this output:

    Item
    Type: http://data-vocabulary.org/person
    photo = http://diveintohtml5.org/examples/2000_05_mark.jpg
    name = Mark Pilgrim
    title = Developer advocate
    affiliation = Google, Inc.
    address = Item( 1 )
    url = http://diveintomark.org/
    url = http://www.google.com/profiles/pilgrim

220.
    url = http://www.reddit.com/user/MarkPilgrim
    url = http://www.twitter.com/diveintomark

    Item 1
    Type: http://data-vocabulary.org/address
    street-address = 100 Main Street
    locality = Anytown
    region = PA
    postal-code = 19999
    country-name = USA

It’s all there: the photo property from the <img src> attribute, all four URLs from the list of <a href> attributes, even the address object (listed as “Item 1”) and all five of its subproperties.

And how does Google use all of this information? That depends. There are no hard and fast rules about how microdata properties should be displayed, which ones should be displayed, or whether they should be displayed at all. If someone searches for “Mark Pilgrim,” and Google determines that this “about” page should rank in the results, and Google decides that the microdata properties it originally found on that page are worth displaying, then the search result listing might look something like this:

    About Mark Pilgrim
    Anytown PA - Developer advocate - Google, Inc.
    Excerpt from the page will show up here.
    Excerpt from the page will show up here.
    diveintohtml5.org/examples/person-plus-microdata.html - Cached - Similar pages

The first line, “About Mark Pilgrim,” is actually the title of the page, given in the <title> element. That’s not terribly exciting; Google does that for every page. But the second line is full of information taken directly from the microdata annotations we added to the page. “Anytown PA” was part of the mailing address, marked up with the http://data-vocabulary.org/Address vocabulary. “Developer advocate” and “Google, Inc.” were two properties from the http://data-vocabulary.org/Person vocabulary (title and affiliation, respectively).

This is really quite amazing. You don’t need to be a large corporation making special deals with search engine vendors to customize your search result listings. 
Just take ten minutes and add a couple of HTML attributes to annotate the data you were already publishing anyway.
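If the tool output above seems like magic, here is a rough sketch, in Python, of how a microdata consumer might pull items out of a page. To be clear, this is my own illustration and not how Google does it: the names MicrodataSketch, extract, and PAGE are made up, the markup is a trimmed-down copy of the “about” page, and the parser assumes elements are closed properly. It keeps a stack of open scopes, so a nested item like the address lands inside its parent.

```python
from html.parser import HTMLParser

# Elements whose property value lives in an attribute, per the chapter's
# data model. Everything else falls back to its text content.
VALUE_ATTRS = {"a": "href", "img": "src", "meta": "content", "time": "datetime"}
# Void elements have no closing tag, so they never go on the element stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class MicrodataSketch(HTMLParser):
    """Hypothetical extractor: collects items and their properties."""

    def __init__(self):
        super().__init__()
        self.items = []     # top-level items, in document order
        self.scopes = []    # stack of items whose scope is currently open
        self.elements = []  # stack of open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        frame = {"tag": tag, "scope": False, "prop": None, "text": []}
        if "itemscope" in attrs:
            # A new item: attach it to the enclosing item, or to the page.
            item = {"type": attrs.get("itemtype"), "properties": []}
            if prop and self.scopes:
                self.scopes[-1]["properties"].append((prop, item))
            else:
                self.items.append(item)
            self.scopes.append(item)
            frame["scope"] = True
        elif prop and self.scopes:
            if tag in VALUE_ATTRS:
                value = attrs.get(VALUE_ATTRS[tag], "")
                self.scopes[-1]["properties"].append((prop, value))
            else:
                # Text-valued property: finish it when the element closes.
                frame["prop"] = prop
                frame["owner"] = self.scopes[-1]
        if tag not in VOID:
            self.elements.append(frame)

    def handle_data(self, data):
        for frame in self.elements:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if not any(frame["tag"] == tag for frame in self.elements):
            return  # stray end tag, e.g. from a self-closed <meta />
        while self.elements:
            frame = self.elements.pop()
            if frame["prop"]:
                text = " ".join("".join(frame["text"]).split())
                frame["owner"]["properties"].append((frame["prop"], text))
            if frame["scope"]:
                self.scopes.pop()  # leaving the item: pop the scope stack
            if frame["tag"] == tag:
                break


def extract(html):
    """Return the list of top-level microdata items found in `html`."""
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.items


PAGE = """
<section itemscope itemtype="http://data-vocabulary.org/Person">
  <img itemprop="photo" src="http://diveintohtml5.org/examples/2000_05_mark.jpg" alt="">
  <dl>
    <dt>Name</dt>
    <dd itemprop="name">Mark Pilgrim</dd>
    <dt>Mailing address</dt>
    <dd itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
      <span itemprop="locality">Anytown</span>,
      <span itemprop="region">PA</span>
    </dd>
  </dl>
  <a href="http://diveintomark.org/" itemprop="url">weblog</a>
</section>
"""

for name, value in extract(PAGE)[0]["properties"]:
    print(name, "=", "Item" if isinstance(value, dict) else value)
```

Each item is a plain dictionary with a type and a list of (name, value) pairs, and a nested item is simply another dictionary in the value position. A list rather than a dictionary of properties is deliberate, since any property (like url) can appear more than once.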

221.
ASK PROFESSOR MARKUP

☞ Q: I did everything you said, but my Google search result listing doesn’t look any different. What gives?
A: “Google does not guarantee that markup on any given page or site will be used in search results.” But even if Google decides not to use your microdata annotations, another search engine might. Like the rest of HTML5, microdata is an open standard that anyone can implement. It’s your job to provide as much data as possible. Let the rest of the world decide what to do with it. They might surprise you!

❧

MARKING UP ORGANIZATIONS

Microdata isn’t limited to a single vocabulary. “About” pages are nice, but you probably only have one of them. Still hungry for more? Let’s learn how to mark up organizations and businesses.

Here is a sample page of business listings. Let’s look at the original HTML markup, without microdata.

    <article>

222.
    <h1>Google, Inc.</h1>
    <p>
    1600 Amphitheatre Parkway<br>
    Mountain View, CA 94043<br>
    USA
    </p>
    <p>650-253-0000</p>
    <p><a href="http://www.google.com/">Google.com</a></p>
    </article>

[Follow along! Before: organization.html, after: organization-plus-microdata.html]

Short and sweet. All the information about the organization is contained within the <article> element, so let’s start there.

    <article itemscope itemtype="http://data-vocabulary.org/Organization">

As with marking up people, you need to set the itemscope and itemtype attributes on the outermost element. In this case, the outermost element is an <article> element. The itemtype attribute declares the microdata vocabulary you’re using (in this case, http://data-vocabulary.org/Organization), and the itemscope attribute declares that all of the properties you set on child elements relate to this vocabulary.

So what’s in the Organization vocabulary? It’s simple and straightforward. In fact, some of it should already look familiar.

ORGANIZATION VOCABULARY

Property  Description
name      The name of the organization (for example, “Initech”)
url       Link to the organization’s home page
address   The location of the organization. Can contain the subproperties street-address, locality, region, postal-code, and country-name.
tel       The telephone number of the organization
geo       Specifies the geographical coordinates of the location. Always contains two subproperties, latitude and longitude.
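A vocabulary table like this one is small enough to keep as a lookup table, which gives you one way to catch a mistyped property name before you publish. The sketch below is an assumption-laden illustration in Python: VOCABULARIES and unknown_properties are names I made up, and the property lists are copied by hand from the tables in this chapter rather than fetched from the vocabulary URLs.

```python
# Property names copied from the Organization and Address descriptions in
# the text. This is a hand-maintained table, not an official registry.
VOCABULARIES = {
    "http://data-vocabulary.org/Organization": {
        "name", "url", "address", "tel", "geo",
    },
    "http://data-vocabulary.org/Address": {
        "street-address", "locality", "region", "postal-code", "country-name",
    },
}


def unknown_properties(itemtype, names):
    """Return, sorted, the names that the vocabulary does not define.

    An itemtype missing from the table has no known properties, so every
    name is reported for it.
    """
    known = VOCABULARIES.get(itemtype, set())
    return sorted(set(names) - known)


print(unknown_properties(
    "http://data-vocabulary.org/Organization",
    ["name", "tel", "phone"],
))
```

Nothing in microdata itself requires this kind of checking; it is purely a convenience for the author of the page.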

223.
The first bit of markup within the outermost <article> element is an <h1>. This <h1> element contains the name of a business, so we’ll put an itemprop="name" attribute directly on the <h1> element.

    <h1 itemprop="name">Google, Inc.</h1>

[Follow along! Before: organization.html, after: organization-plus-microdata.html]

According to the HTML5 microdata data model, <h1> elements don’t need any special processing. The microdata property value is simply the text content of the <h1> element. In English, we just said “the name of the Organization is Google, Inc.”

Next up is a street address. Marking up the address of an Organization works exactly the same way as marking up the address of a Person. First, add an itemprop="address" attribute to the outermost element of the street address (in this case, a <p> element). That states that this is the address property of the Organization. But what about the properties of the address itself? We also need to define the itemtype and itemscope attributes to say that this is an Address item that has its own properties.

    <p itemprop="address" itemscope
    itemtype="http://data-vocabulary.org/Address">

[Follow along! Before: organization.html, after: organization-plus-microdata.html]

Finally, we need to wrap each distinct piece of information in a dummy <span> element so we can add the appropriate microdata property name (street-address, locality, region, postal-code, and country-name) on each <span> element.

    <p itemprop="address" itemscope
    itemtype="http://data-vocabulary.org/Address">
    <span itemprop="street-address">1600 Amphitheatre Parkway</span><br>
    <span itemprop="locality">Mountain View</span>,
    <span itemprop="region">CA</span>
    <span itemprop="postal-code">94043</span><br>
    <span itemprop="country-name">USA</span>
    </p>

224.
[Follow along! Before: organization.html, after: organization-plus-microdata.html]

In English, we just said “This organization has an address. The street address part is 1600 Amphitheatre Parkway. The locality is Mountain View. The region part is CA. The postal code is 94043. The name of the country is USA.”

Next up: a telephone number for the Organization. Telephone numbers are notoriously tricky, and the exact syntax is country-specific. (And if you want to call another country, it’s even worse.) In this example, we have a United States telephone number, in a format suitable for calling from elsewhere in the United States.

    <p itemprop="tel">650-253-0000</p>

[Follow along! Before: organization.html, after: organization-plus-microdata.html]

(Hey, in case you didn’t notice, the Address vocabulary went out of scope when its <p> element was closed. Now we’re back to defining properties in the Organization vocabulary.)

If you want to list more than one telephone number (maybe one for United States customers and one for international customers), you can do that. Any microdata property can be repeated. Just make sure each telephone number is in its own HTML element, separate from any label you may give it.

    <p>
    US customers: <span itemprop="tel">650-253-0000</span><br>
    UK customers: <span itemprop="tel">00 + 1* + 6502530000</span>
    </p>

According to the HTML5 microdata data model, neither the <p> element nor the <span> element has special processing. The value of the microdata tel property is simply the text content. The Organization microdata vocabulary makes no attempt to subdivide the different parts of a telephone number. The entire tel property is just free-form text. If you want to put the area code in parentheses, or use spaces instead of dashes to separate the numbers, you can do that. If a microdata-consuming client wants to parse the telephone number, that’s entirely up to them.
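What might “entirely up to them” look like on the consuming side? Here is a deliberately naive sketch in Python that throws away everything except the digits, so that differently punctuated versions of the same number compare equal. The function name digits_only is made up, and real telephone parsing (country codes, extensions, dialing prefixes) is far messier than this.

```python
import re


def digits_only(tel):
    """Strip every non-digit character from a free-form telephone string."""
    return re.sub(r"\D", "", tel)


# Two ways an author might reasonably write the same US number.
print(digits_only("650-253-0000"))
print(digits_only("(650) 253 0000"))
```

This is a choice made by one hypothetical client, not something the Organization vocabulary asks for. The markup stays free-form either way.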

225.
Next, we have another familiar property: url. Just like associating a URL with a Person, you can associate a URL with an Organization. This could be the company’s home page, a contact page, product page, or anything else. If it’s a URL about, from, or belonging to the Organization, mark it up with an itemprop="url" attribute.

    <p><a itemprop="url" href="http://www.google.com/">Google.com</a></p>

[Follow along! Before: organization.html, after: organization-plus-microdata.html]

According to the HTML5 microdata data model, the <a> element has special processing. The microdata property value is the value of the href attribute, not the link text. In English, this says “this organization is associated with the URL http://www.google.com/.” It doesn’t say anything more specific about the association, and it doesn’t include the link text “Google.com.”

Finally, I want to talk about geolocation. No, not the W3C Geolocation API. This is about how to mark up the physical location for an Organization, using microdata.

To date, all of our examples have focused on marking up visible data. That is, you have an <h1> with a company name, so you add an itemprop attribute to the <h1> element to declare that the (visible) header text is, in fact, the name of an Organization. Or you have an <img> element that points to a photo, so you add an itemprop attribute to the <img> element to declare that the (visible) image is a photo of a Person.

In this example, geolocation information isn’t like that. There is no visible text that gives the exact latitude and longitude (to four decimal places!) of the Organization. In fact, the organization.html example (without microdata) has no geolocation information at all. It has a link to Google Maps, but even the URL of that link does not contain latitude and longitude coordinates. (It contains similar information in a Google-specific format.) 
But even if we had a link to a hypothetical online mapping service that did take latitude and longitude coordinates as URL parameters, microdata has no way of separating out the different parts of a URL. You can’t declare that the first URL query parameter is the latitude and the second URL query parameter is the longitude and the rest of the query parameters are irrelevant.

To handle edge cases like this, HTML5 provides a way to annotate invisible data. This technique should only be used as a last resort. If there is a way to display or render the data you care about, you should do so. Invisible data that only machines can read tends to “go

226.
stale” quily. at is, someone will come along later and update the visible text but forget toupdate the invisible data. is happens more oen than you think, and it will happen to youtoo.Still, there are cases where invisible data is unavoidable. Perhaps your boss really wantsmaine-readable geolocation information but doesn’t want to cluer up the interface withpairs of incomprehensible six-digit numbers. Invisible data is the only option. e only savinggrace here is that you can put the invisible data immediately aer the visible text that itdescribes, whi may help remind the person who comes along later and updates the visibletext that they need to update the invisible data right aer it.In this example, we can create a dummy <span> element within the same <article>element as all the other Organization properties, then put the invisible geolocation data insidethe <span> element.<span itemprop="geo" itemscopeitemtype="http://data-vocabulary.org/Geo"><meta itemprop="latitude" content="37.4149" /><meta itemprop="longitude" content="-122.078" /></span></article> [Follow along! Before: organization.html, aer: organization-plus-microdata.html]Geolocation information is deﬁned in its own vocabulary, like the address of a Person orOrganization. erefore, this <span> element needs three aributes: 1. itemprop="geo" says that this element represents the geo property of the surrounding Organization 2. itemtype="http://data-vocabulary.org/Geo" says whi microdata vocabulary this element’s properties conform to 3. itemscope says that this element is the enclosing element for a microdata item with its own vocabulary (given in the itemtype aribute). All the properties within this element are properties of http://data-vocabulary.org/Geo, not the surrounding http://data-vocabulary.org/Organization .e next big question that this example answers is, “How do you annotate invisible data?”diveintohtml5.org “DISTRIBUTED,” “EXTENSIBILITY,” & OTHER FANCY WORDS

227.
You use the <meta> element. In previous versions of HTML, you could only use the <meta> element within the <head> of your page. In HTML5, you can use the <meta> element anywhere. And that’s exactly what we’re doing here.

    <meta itemprop="latitude" content="37.4149" />

[Follow along! Before: organization.html, after: organization-plus-microdata.html]

According to the HTML5 microdata data model, the <meta> element has special processing. The microdata property value is the content attribute. Since this attribute is never visibly displayed, we have the perfect setup for unlimited quantities of invisible data. With great power comes great responsibility. In this case, the responsibility is on you to ensure that this invisible data stays in sync with the visible text around it.

There is no direct support for the Organization vocabulary in Google Rich Snippets, so I don’t have any pretty sample search result listings to show you. But organizations feature heavily in the next two case studies: events and reviews, and those are supported by Google Rich Snippets.

❧

MARKING UP EVENTS

Shit happens. Some shit happens at pre-determined times. Wouldn’t it be nice if you could tell search engines exactly when shit was about to happen? There’s an angle bracket for that.

Let’s start by looking at a sample schedule of my speaking engagements.

    <article>
    <h1>Google Developer Day 2009</h1>
    <img width="300" height="200"
    src="http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg"
    alt="[Mark Pilgrim at podium]">
    <p>

229.
summary e name of the event url Link to the event details page location e location or venue of the event. Can optionally be represented by a nested Organization or Address. description A description of the event startDate e starting date and time of the event in ISO date format endDate e ending date and time of the event in ISO date format duration e duration date of the event in ISO duration format eventType e category of the event (for example, “Concert” or “Lecture”). is is a freeform string, not an enumerated aribute. geo Speciﬁes the geographical coordinates of the location. Always contains two subproperties, latitude and longitude. photo A link to a photo or image related to the evente event’s name is in an <h1> element. According to the HTML5 microdata data model,<h1> elements have no special processing. e microdata property value is simply the textcontent of the <h1> element. All we need to do is add the itemprop aribute to declare thatthis <h1> element contains the name of the event.<h1 itemprop="summary">Google Developer Day 2009</h1> [Follow along! Before: event.html, aer: event-plus-microdata.html]In English, this says, “e name of this event is Google Developer Day 2009.”is event listing has a photo, whi can be marked up with the photo property. As youwould expect, the photo is already marked up with an <img> element. Like the photoproperty in the Person vocabulary, an Event photo is a URL. Since the HTML5 microdata datamodel says that the property value of an <img> element is its src aribute, the only thingwe need to do is add the itemprop aribute to the <img> element.<img itemprop="photo" width="300" height="200"src="http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg"alt="[Mark Pilgrim at podium]"> [Follow along! Before: event.html, aer: event-plus-microdata.html]diveintohtml5.org “DISTRIBUTED,” “EXTENSIBILITY,” & OTHER FANCY WORDS

230.
In English, this says, “The photo for this event is at http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg.”

Next up is a longer description of the event, which is just a paragraph of freeform text.

    <p itemprop="description">Google Developer Days are a chance to
    learn about Google developer products from the engineers who built
    them. This one-day conference includes seminars and “office
    hours” on web technologies like Google Maps, OpenSocial,
    Android, AJAX APIs, Chrome, and Google Web Toolkit.</p>

[Follow along! Before: event.html, after: event-plus-microdata.html]

The next bit is something new. Events generally occur on specific dates and start and end at specific times. In HTML5, dates and times should be marked up with the <time> element, and we are already doing that here. So the question becomes, how do we add microdata properties to these <time> elements? Looking back at the HTML5 microdata data model, we see that the <time> element has special processing. The value of a microdata property on a <time> element is the value of the datetime attribute. And hey, the startDate and endDate properties of the Event vocabulary take an ISO-style date, just like the datetime attribute of a <time> element. Once again, the semantics of the core HTML vocabulary dovetail nicely with semantics of our custom microdata vocabulary. Marking up start and end dates with microdata is as simple as

1. Using HTML correctly in the first place (using <time> elements to mark up dates and times), and
2. Adding a single itemprop attribute

    <p>
    <time itemprop="startDate" datetime="2009-11-06T08:30+01:00">2009 November 6, 8:30</time>
    &ndash;
    <time itemprop="endDate" datetime="2009-11-06T20:30+01:00">20:30</time>
    </p>

[Follow along! Before: event.html, after: event-plus-microdata.html]
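One payoff of putting an ISO-style value in the datetime attribute is that a consumer can do arithmetic with it, without guessing what the visible text “20:30” is supposed to mean. As a small sketch in Python (event_length, START, and END are names I made up; the two values are copied from the markup above), the standard library parses these strings directly:

```python
from datetime import datetime


def event_length(start_value, end_value):
    """Return the time between two ISO-style datetime attribute values."""
    start = datetime.fromisoformat(start_value)
    end = datetime.fromisoformat(end_value)
    return end - start


# The startDate and endDate values from the event markup.
START = "2009-11-06T08:30+01:00"
END = "2009-11-06T20:30+01:00"

print(event_length(START, END))
print(datetime.fromisoformat(START).utcoffset())
```

The difference works out to twelve hours, and the +01:00 offset travels with the parsed value. One caveat: datetime.fromisoformat accepts only a subset of ISO 8601 in older Python versions, so other date shapes may need a different parser.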

In English, this says, “This event starts on November 6, 2009, at 8:30 in the morning, and goes until November 6, 2009, at 20:30 (times local to Prague, GMT+1).”

Next up is the location property. The definition of the Event vocabulary says that this can be either an Organization or an Address. In this case, the event is being held at a venue that specializes in conferences, the Congress Center in Prague. Marking it up as an Organization allows us to include the name of the venue as well as its address.

First, let’s declare that the <p> element that contains the address is the location property of the Event, and that this element is also its own microdata item that conforms to the http://data-vocabulary.org/Organization vocabulary.

    <p itemprop="location" itemscope
       itemtype="http://data-vocabulary.org/Organization">

Next, mark up the name of the Organization by wrapping the name in a dummy <span> element and adding an itemprop attribute to the <span> element.

    <span itemprop="name">Congress Center</span><br>

Due to the microdata scoping rules, this itemprop="name" is defining a property in the Organization vocabulary, not the Event vocabulary. The <p> element defined the beginning of the scope of the Organization properties, and that <p> element hasn’t yet been closed with an </p> tag. Any microdata properties we define here are properties of the most-recently-scoped vocabulary. Nested vocabularies are like a stack. We haven’t yet popped the stack, so we’re still talking about properties of the Organization.

In fact, we’re going to add a third vocabulary onto the stack: an Address for the Organization for the Event.

    <span itemprop="address" itemscope
          itemtype="http://data-vocabulary.org/Address">

Once again, we want to mark up every piece of the address as a separate microdata property, so we need a slew of dummy <span> elements to hang our itemprop attributes onto. (If I’m going too fast for you here, go back and read about marking up the address of a Person and marking up the address of an Organization.)

    <span itemprop="street-address">5th května 65</span><br>
    <span itemprop="postal-code">140 21</span>
    <span itemprop="locality">Praha 4</span><br>
    <span itemprop="country-name">Czech Republic</span>

There are no more properties of the Address, so we close the <span> element that started the Address scope, and pop the stack.

    </span>

There are no more properties of the Organization, so we close the <p> element that started the Organization scope, and pop the stack again.

    </p>

Now we’re back to defining properties on the Event. The next property is geo, to represent the physical location of the Event. This uses the same Geo vocabulary that we used to mark up the physical location of an Organization in the previous section. We need a <span> element to act as the container; it gets the itemtype and itemscope attributes. Within that <span> element, we need two <meta> elements, one for the latitude property and one for the longitude property.

    <span itemprop="geo" itemscope
          itemtype="http://data-vocabulary.org/Geo">
      <meta itemprop="latitude" content="50.047893" />
      <meta itemprop="longitude" content="14.4491" />
    </span>

And we’ve closed the <span> that contained the Geo properties, so we’re back to defining properties on the Event. The last property is the url property, which should look familiar. Associating a URL with an Event works the same way as associating a URL with a Person and associating a URL with an Organization. If you’re using HTML correctly (marking up hyperlinks with <a href>), then declaring that the hyperlink is a microdata url property is simply a matter of adding the itemprop attribute.

    <p><a itemprop="url"
          href="http://code.google.com/intl/cs/events/developerday/2009/home.html">GDD/Prague home page</a></p>
    </article>

The sample event page also lists a second event, my speaking engagement at the ConFoo conference in Montréal. For brevity, I’m not going to go through that markup line by line. It’s essentially the same as the event in Prague: an Event item with nested Geo and Address items. I just mention it in passing to reiterate that a single page can have multiple events, each marked up with microdata.

THE RETURN OF GOOGLE RICH SNIPPETS

According to Google’s Rich Snippets Testing Tool, this is the information that Google’s crawlers will glean from our sample event listing page:

    ItemType: http://data-vocabulary.org/Event
    summary = Google Developer Day 2009
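As a rough idea of how a consumer could arrive at grouped output like this, the nested scopes described in this section (Event, then Organization, then Address, then Geo) can be sketched as a literal stack in Python. Everything here is an assumption made for illustration: the class name MicrodataSketch, the dictionary layout, and the dump format are hypothetical, the opening <article> tag is assumed by analogy with the closing </article> shown earlier, and the sketch requires every non-void tag to be explicitly closed. It is not how Google's tool works.

```python
from html.parser import HTMLParser

# Elements with no end tag; they never open a nesting level.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}
# Elements whose value is read from an attribute; others use text content.
ATTRIBUTE_VALUED = {"a": "href", "img": "src", "meta": "content", "time": "datetime"}


class MicrodataSketch(HTMLParser):
    """Simplified extractor that tracks item scopes with a stack."""

    def __init__(self):
        super().__init__()
        self.items = []        # top-level items
        self._scopes = []      # stack of (item, depth of the element that opened it)
        self._pending = None   # [owner item, property name, depth, text chunks]
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if name and self._scopes:
                # A nested item is a property of the innermost open scope.
                self._scopes[-1][0]["properties"][name] = item
            else:
                self.items.append(item)
            self._scopes.append((item, self._depth))   # push
        elif name and self._scopes:
            owner = self._scopes[-1][0]
            if tag in ATTRIBUTE_VALUED:
                owner["properties"][name] = attrs.get(ATTRIBUTE_VALUED[tag]) or ""
            else:
                self._pending = [owner, name, self._depth, []]
        if tag not in VOID_TAGS:
            self._depth += 1

    def handle_data(self, data):
        if self._pending is not None:
            self._pending[3].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._pending is not None and self._pending[2] == self._depth:
            owner, name, _, chunks = self._pending
            owner["properties"][name] = "".join(chunks).strip()
            self._pending = None
        if self._scopes and self._scopes[-1][1] == self._depth:
            self._scopes.pop()   # the element that opened this scope has closed


def dump(item, indent=0):
    """Print an item and its nested items, one property per line."""
    pad = "  " * indent
    print(f"{pad}ItemType: {item['type']}")
    for name, value in item["properties"].items():
        if isinstance(value, dict):
            print(f"{pad}{name} =")
            dump(value, indent + 1)
        else:
            print(f"{pad}{name} = {value}")


markup = """
<article itemscope itemtype="http://data-vocabulary.org/Event">
<h1 itemprop="summary">Google Developer Day 2009</h1>
<p itemprop="location" itemscope itemtype="http://data-vocabulary.org/Organization">
<span itemprop="name">Congress Center</span><br>
<span itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
<span itemprop="locality">Praha 4</span><br>
<span itemprop="country-name">Czech Republic</span>
</span>
</p>
<span itemprop="geo" itemscope itemtype="http://data-vocabulary.org/Geo">
<meta itemprop="latitude" content="50.047893" />
<meta itemprop="longitude" content="14.4491" />
</span>
</article>
"""

parser = MicrodataSketch()
parser.feed(markup)
parser.close()
event = parser.items[0]
dump(event)
```

Note that name ends up on the Organization, not on the Event, purely because of which scope was on top of the stack when the <span> was read.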

linearize the sample output and show you the grouping of nested items and their properties.

Here is how Google might choose to represent this sample page in its search results. (Again, I have to preface this with the disclaimer that this is just an example. Google may change the format of their search results at any time, and there is no guarantee that Google will even pay attention to your microdata markup. Sorry to sound like a broken record, but our lawyers make me say these things.)

    Mark Pilgrim’s event calendar
    Excerpt from the page will show up here.
    Excerpt from the page will show up here.
    Google Developer Day 2009   Fri, Nov 6    Congress Center, Praha 4, Czech Republic
    ConFoo.ca 2010              Wed, Mar 10   Hilton Montreal Bonaventure, Montréal, Québec, Canada
    diveintohtml5.org/examples/event-plus-microdata.html - Cached - Similar pages

After the page title and auto-generated excerpt text, Google starts using the microdata markup we added to the page to display a little table of events. Note the date format: “Fri, Nov 6.” That is not a string that appeared anywhere in our HTML or microdata markup. We used two fully qualified ISO-formatted strings, 2009-11-06T08:30+01:00 and 2009-11-06T20:30+01:00. Google took those two dates, figured out that they were on the same day, and decided to display a single date in a more friendly format.

Now look at the physical addresses. Google chose to display just the venue name + locality + country, not the exact street address. This is made possible by the fact that we split up the address into five subproperties (name, street-address, region, locality, and country-name) and marked up each part of the address as a different microdata property. Google takes advantage of that to show an abbreviated address. Other consumers of the same microdata markup might make different choices about what to display or how to display it. There’s no right or wrong choice here. It’s up to you to provide as much data as possible, as accurately as possible. It’s up to the rest of the world to interpret it.

❧

MARKING UP REVIEWS

Here’s another example of making the web (and possibly search result listings) better through markup: business and product reviews.

This is a short review I wrote of my favorite pizza place near my house. (This is a real restaurant, by the way. If you’re ever in Apex, NC, I highly recommend it.) Let’s look at the original markup:

    <article>
    <h1>Anna’s Pizzeria</h1>
    <p>★★★★☆ (4 stars out of 5)</p>
    <p>New York-style pizza right in historic downtown Apex</p>
    <p>Food is top-notch. Atmosphere is just right for a “neighborhood
    pizza joint.” The restaurant itself is a bit cramped; if you’re
    overweight, you may have difficulty getting in and out of your
    seat and navigating between other tables. Used to give free
    garlic knots when you sat down; now they give you plain bread
    and you have to pay for the good stuff. Overall, it’s a winner.</p>
    <p>100 North Salem Street<br>
    Apex, NC 27502<br>
    USA</p>
    <p>— reviewed by Mark Pilgrim, last updated March 31, 2010</p>
    </article>

[Follow along! Before: review.html, after: review-plus-microdata.html]

This review is contained in an <article> element, so that’s where we’ll put the itemtype and itemscope attributes. The namespace URL for this vocabulary is http://data-vocabulary.org/Review.

    <article itemscope itemtype="http://data-vocabulary.org/Review">

What are the available properties in the Review vocabulary? I’m glad you asked.

REVIEW VOCABULARY

Property      Description
itemreviewed  The name of the item being reviewed. Can be a product, service, business, &c.
rating        A numerical quality rating for the item, on a scale from 1 to 5. Can also be a nested http://data-vocabulary.org/Rating vocabulary to use a nonstandard scale.
reviewer      The name of the author who wrote the review
dtreviewed    The date that the item was reviewed in ISO date format
summary       A short summary of the review
description   The body of the review

The first property is simple: itemreviewed is just text, and here it’s contained in an <h1> element, so that’s where we should put the itemprop attribute.

    <h1 itemprop="itemreviewed">Anna’s Pizzeria</h1>

I’m going to skip over the actual rating and come back to that at the end.

The next two properties are also straightforward. The summary property is a short description of what you’re reviewing, and the description property is the body of the review.

    <p itemprop="summary">New York-style pizza right in historic downtown
    Apex</p>
    <p itemprop="description">
    Food is top-notch. Atmosphere is just right for a “neighborhood
    pizza joint.” The restaurant itself is a bit cramped; if you’re
    overweight, you may have difficulty getting in and out of your
    seat and navigating between other tables. Used to give free
    garlic knots when you sat down; now they give you plain bread
    and you have to pay for the good stuff. Overall, it’s a winner.
    </p>

The location and geo properties aren’t anything we haven’t tackled before. (If you’re just tuning in, check out marking up the address of a Person, marking up the address of an Organization, and marking up geolocation information from earlier in this chapter.)

    <p itemprop="location" itemscope
       itemtype="http://data-vocabulary.org/Address">
      <span itemprop="street-address">100 North Salem Street</span><br>
      <span itemprop="locality">Apex</span>,
      <span itemprop="region">NC</span>
      <span itemprop="postal-code">27502</span><br>
      <span itemprop="country-name">USA</span>
    </p>
    <span itemprop="geo" itemscope
          itemtype="http://data-vocabulary.org/Geo">
      <meta itemprop="latitude" content="35.730796" />
      <meta itemprop="longitude" content="-78.851426" />
    </span>

The final line presents a familiar problem: it contains two bits of information in one element. The name of the reviewer is Mark Pilgrim, and the review date is March 31, 2010. How do we mark up these two distinct properties? Wrap them in their own elements and put an itemprop attribute on each element. In fact, the date in this example should have been marked up with a <time> element in the first place, so that provides a natural hook on which to hang our itemprop attribute. The reviewer name can just be wrapped in a dummy <span> element.

    <p>— <span itemprop="reviewer">Mark Pilgrim</span>, last updated
    <time itemprop="dtreviewed" datetime="2010-03-31">March 31, 2010</time>
    </p>
    </article>

OK, let’s talk ratings. The trickiest part of marking up a review is the rating. By default, ratings in the Review vocabulary are on a scale of 1–5, 1 being “terrible” and 5 being “awesome.” If you want to use a different scale, you can definitely do that. But let’s talk about the default scale first.

    <p>★★★★☆ (<span itemprop="rating">4</span> stars out of 5)</p>

If you’re using the default 1–5 scale, the only property you need to mark up is the rating itself (4, in this case). But what if you want to use a different scale? You can do that; you just need to declare the limits of the scale you’re using. For example, if you wanted to use a 0–10 point scale, you would still declare the itemprop="rating" property, but instead of giving the rating value directly, you would use a nested vocabulary of http://data-vocabulary.org/Rating to declare the worst and best values in your custom scale and the actual rating value within that scale.

    <p itemprop="rating" itemscope
       itemtype="http://data-vocabulary.org/Rating">
      ★★★★★★★★★☆
      (<span itemprop="value">9</span> on a scale of
       <span itemprop="worst">0</span> to
       <span itemprop="best">10</span>)
    </p>

In English, this says “the product I’m reviewing has a rating value of 9 on a scale of 0–10.”

Did I mention that review microdata could affect search result listings? Oh yes, it can. Here is the “raw data” that the Google Rich Snippets tool extracted from my microdata-enhanced review:

    ItemType: http://data-vocabulary.org/Review
    itemreviewed = Anna’s Pizzeria
    rating = 4
    summary = New York-style pizza right in historic downtown Apex
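Declaring worst and best alongside value is what lets a consumer compare ratings made on different scales. This section does not say how any particular consumer does that. One plausible approach, offered here purely as an assumption, is to rescale linearly onto the default 1–5 range. The function name to_default_scale is hypothetical.

```python
def to_default_scale(value, worst=1.0, best=5.0):
    """Linearly map a rating from [worst, best] onto the default 1-5 scale.

    This is an illustrative assumption about what a consumer could do,
    not a rule stated by the Review or Rating vocabularies.
    """
    if best == worst:
        raise ValueError("best and worst must differ")
    fraction = (value - worst) / (best - worst)   # 0.0 = worst, 1.0 = best
    return 1.0 + fraction * 4.0


# The default-scale example: a plain rating of 4 is left unchanged.
print(round(to_default_scale(4), 2))                     # 4.0

# The custom-scale example: value 9, worst 0, best 10.
print(round(to_default_scale(9, worst=0, best=10), 2))   # 4.6
```

Under this assumption, the 9-out-of-10 review would rank slightly above the 4-out-of-5 one, which is information a consumer could not recover from the number 9 alone.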

Microdata resources:
    Live microdata playground
    HTML5 microdata specification

Google Rich Snippets resources:
    About rich snippets and structured data
    Marking up contact and social networking information
    Businesses & organizations
    Events
    Reviews
    Review ratings
    Google Rich Snippets Testing Tool
    Google Rich Snippets Tips and Tricks

❧

This has been ‘“Distributed,” “Extensibility,” & Other Fancy Words.’ The full table of contents has more if you’d like to keep reading.

DID YOU KNOW?

In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.

If you liked this chapter and want to show your appreciation, you can buy “HTML5: Up & Running” with this affiliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a

XMLHpRequest Level 2JavaScript libraries: Modernizr, an HTML5 detection library ❧is has been “e All-In-One Almost-Alphabetical No-Bullshit Guide to DetectingEverything.” e full table of contents has more if you’d like to keep reading.DID YOU KNOW? In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. e paid edition is called “HTML5: Up & Running,” and it is available now. is appendix is included in the paid edition. If you liked this appendix and want to show your appreciation, you can buy “HTML5: Up & Running” with this aﬃliate link or buy an electronic edition directly from O’Reilly. You’ll get a book, and I’ll get a bu. I do not currently accept direct donations. Copyright MMIX–MMX Mark Pilgrim powered by Google™ Searchdiveintohtml5.org THE ALL-IN-ONE ALMOST-ALPHABETICAL NO-BULLSHIT GUIDE TO DETECTING EVERYTHING