Web 3D, Part 1: Introduction

This series of articles was inspired by my quest to find a functional "Web 3D" solution for several startup ideas I have in development (no details now, other than that they could really use "Web 3D"). The need actually goes back to the 1990s, for me personally, when I was involved with several other heavily networked 3D startups, the most successful of which decided to roll their own custom solutions and make them work.

I’ll describe those in more detail in a minute. But for readers of little patience, the one-sentence summary of this sprawling multi-part article is to ask the question "Where is Web 3D?" and then try to answer it, in terms of its history, its present, and the future we can hopefully look forward to. In the process, I’ve assembled interviews with a number of people working in the field and interspersed the narrative with my own somewhat skeptical real-life experience, having been at this VR thing for close to 15 years now.

For example, in 1999, at the early stages of what would later become Google Earth, we wanted a way to mix dynamic high-performance 2D and 3D content, both company-hosted (the Earth imagery, data layers, etc.) and remotely-linked (everything else), ideally rendered inside any standard web browser. That meant seeing all 2D/3D content in the same window, at the same time, mixing like they do in all those crazy science fiction movies–you know, as if they were designed that way.

The advantages would have been immediate and profound: no special app to download, easy hyperlinks to outside documents (and vice versa), and perhaps even a standard way to geo-reference web content such that a spinning 3D Earth could live in the corner of your browser, full time, zooming down and intermixing with 2D content as desired.

Were we asking too much? From a technological perspective, no. EarthViewer could run on unaccelerated laptops. But from a marketplace perhaps unsure of what 3D was even good for, yes, very much.

Back then, web-based maps were loaded one slow JPEG at a time. And the only approximate way to do what we wanted was to use a browser plug-in, an embedded object, or a dreaded control, installed and invoked, with all of the drawbacks that entailed. For example, your standard browser plug-in tended to re-initialize each time a web page was loaded, causing lengthy, unseemly delays. And until recently, making the "traditional" 2D elements of a conjoined 2D/3D web ensemble "come alive" has been rather hard, assuming you even got that far. Those 2D parts have tended to just sit there at best, waiting off to the side, more like pages from a magazine than an interactive display.

To date, the most common browser plug-in for 3D is probably Shockwave. But I’m not presently aware of anyone using it for more than cute games or fancy one-off interfaces. It certainly wasn’t suitable for Google Earth, with our custom high-performance rendering/streaming pipeline. And it still suffered the fate of all plug-ins, being built on a programming model where rich media is invoked and then dismissed, but only superficially intermixed. Even today, we’re just barely at the stage where streaming internet TV apps work everywhere, à la Google and YouTube.

So, as fate would have it, we rolled our own browser: EarthViewer. It did fairly well. And lately, Google has taken that to a much richer extreme in Google Earth. They’ve exposed much of the internal functionality in a way that’s more browser-like, going so far as to publish their own equivalent of the Document Object Model, and their own equivalent of HTML for marking up the 2D and 3D user interface and adding more 3D content. GE has, in a strong way, gone from wanting to live inside a browser to being its own browser framework, optimized for 3D and geospatial content.
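That HTML-equivalent is KML, Google Earth's XML markup for geographic content. To make the analogy concrete, here is a minimal sketch of a placemark (the namespace version and the specific coordinates are my own illustrative choices, not anything from Google's docs verbatim):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal KML sketch; the namespace URI varies by KML version -->
<kml xmlns="http://earth.google.com/kml/2.1">
  <Placemark>
    <name>Empire State Building</name>
    <description>Descriptions can carry ordinary HTML, links and all.</description>
    <Point>
      <!-- longitude,latitude,altitude -->
      <coordinates>-73.9857,40.7484,0</coordinates>
    </Point>
  </Placemark>
</kml>
```

The fact that a KML description can itself contain HTML is a small taste of exactly the 2D/3D intermixing this article keeps circling around.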

Second Life, for which I also did some work (not long after I left Google), also rolled their own browser, for similar reasons and with similar results. Their client implements much of the functionality we’d normally expect in "Web 3D," albeit in a proprietary and highly literal space (more on that later). For importing 2D web content, they chose to employ a technique that could render web pages inside their world as texture maps applied to 3D objects. And it’s served them well enough so far. They’re also working on making their UI more scriptable, HTML-like, and adding more user-customizability, just as browsers have done.

A dozen other top-tier and highly-successful "Networked 3D" apps exist as well, largely as islands in the net, largely custom and almost totally set apart. They commonly use a markup language (HTML or something with slightly better or worse functionality) for internally describing their user-interfaces. Most of them employ server-side scripting and networked relational databases in a way that enterprise developers might find familiar. And they all use some sort of meta-data for tracking and manipulating 3D objects.

They are using the apparent technology of Web 3D, but without the benefits of the open web. These apps are in some sense "tied together," not by any adopted standards or ubiquitous UI metaphors, but by parallel technologies and perhaps some clever user-side hacks (like OGLE, which lets you copy OpenGL-rendered objects from one app for later re-use in another). They’ve all bypassed the old browser problems and solved things in their own roughly similar, roughly equivalent, but still-incompatible ways.

In a real sense, it’s a fractured landscape. And a lot of that is completely understandable, even intentional to some extent. For example, it would be jarring (though interesting to some) to see the Empire State Building pop into the middle of World of Warcraft. It would probably take a substantial extra effort to add avatars and real physical interactions to Google Earth, when the customer need hasn’t yet been shown. And Second Life objects, without Linden’s servers bringing them to life, would be next to meaningless to any other system. They’d be like Pinocchio, without the magic, and without the wood.

The real value of Web3D, as we’ll see later on, isn’t simply in transporting 3D objects between walled-in domains. It’s about tying everything together in an endless 3D Web (hence the name), much as the traditional 2D web ties any number of independent services and users together as first class citizens of one vast conceptual hyperlinked book, with a billion pages, a trillion citations, and a few really useful indices.

It’s more about the infrastructure, the interface, the cross-references, the mashups, and the novel uses we find through open experimentation and vendor-neutral free enterprise. That’s what makes Web 3D important as a technology and a movement for the next generation of the Web. But it feels as if Web 3D is stuck in the early ’90s, in the last days of CompuServe, isolated AOL, and GEnie, bringing early adopters to some unique, if campy, activities, but only showing a fraction of the potential that comes from the big open space.

That is what I see today. But it is finally beginning to change.

My thesis is simple: the standard web browser can remain the center of daily networked life, the framework of evolution for the web, if it evolves to embrace 3D as a first-class citizen, along with text and images and links. It may be called the Web 3D browser, or it may simply be called Firefox, Opera, and IE. Either way, for people like me, that evolution can’t happen soon enough.

Overview

Now, or in the near future, Firefox (and certainly other browsers too) will be capable of rendering 3D graphics windows in arbitrary areas inside the main browser window in a fairly system-independent way. It’s an active area of development, with some important implications down the line. Plugins for rendering X3D content (the successor to VRML) already exist, most notably from Media Machines. And big 3D worlds, like Second Life, have the possibility, with improvements, to expand beyond their current walled gardens into a broader shared framework for Web3D.
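For the curious, X3D is itself an XML markup, declarative in the same spirit as HTML. A minimal scene, sketched here without reference to any particular plug-in (the profile and version attributes are my own choices for illustration), looks something like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- A minimal X3D scene: a single reddish box -->
<X3D profile="Interchange" version="3.1">
  <Scene>
    <Shape>
      <Appearance>
        <Material diffuseColor="0.8 0.2 0.2"/>
      </Appearance>
      <Box size="2 2 2"/>
    </Shape>
  </Scene>
</X3D>
```

Like HTML, the scene is a document rather than a program, which is part of why X3D keeps coming up as a candidate substrate for a hyperlinked Web 3D.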

But as yet, I’ve seen no single coherent vision of how to bring 3D to the masses in a seamless, interoperable way. There are many ideas, however, from people who have put considerable thought into the matter.

Jerry Paffendorf has set out to build a Metaverse Roadmap, a living document which aims to unite the disparate efforts in a big shared vision. This article will include an interview with him, as well as Tony Parisi of Media Machines. Down the road, I’ve slated an interview with Cory Ondrejka, CTO of Linden Research (makers of Second Life). And to more fully round out the discussion of Web 3D in browsers, I’ve included Vlad Vukicevic, the Mozilla developer perhaps most directly responsible for adding 3D to Firefox. If I can find someone responsible for 3D browsing technology at Microsoft (they have some older efforts I’m aware of, but even their Virtual Earth, for example, is pretty 2D), I’d love to interview them too. Open call, MS.

I’m not going to try to recreate the Metaverse Roadmap here, except to dig into some problem areas which I think, perhaps for historical or political reasons, the roadmap may not touch. I am going to try to resolve why, despite the prevalence of 3D hardware out there, the current set of 3D standards has not taken off, whether that’s an issue with the standards, technology, developers, end-users, or any combination of the lot. And I will attempt to draw analogies between what made the Web take off, and what’s still missing in Web3D.