Software [In]security: Securing Web 3.0

The heady promise of Web 3.0 is true multipurpose apps mixing public and private data sources. But Gary McGraw warns that we haven't yet solved (or perhaps even thoroughly considered) some of the serious security issues involved.

Securing Web 3.0 presents some serious challenges worth considering. Though we have yet to secure simple Web 1.0 applications or more complex SOA-based Web 2.0 applications, the world marches on while security plays perpetual catch-up. According to marketers and those at the cutting edge of Web development, the time has come for Web 3.0.

Web 3.0, a.k.a. the Semantic Web

The big idea behind Web 3.0 is to be able to create applications impervious to data representation issues, in order to enable automated and flexible integration between public and private data sources. Think about the problem this way: A Web app built to process data that comes to it as an internal Excel spreadsheet looks significantly different from a Web app constructed to process data that arrives in the form of a public database, which in turn looks significantly different from a Web app designed to interact with a specially created PC-based client component. Wouldn't it be nice to be able to build an app that can handle any or all of these data sources without complete recoding?

To do this, Web 3.0 relies on lots of XML tagging and bagging, URIs to allow pass by reference, and the idea that everything is addressable. This move allows for what practitioners call "representational state transfer" (ReST), in a resource-oriented architecture, which provides the power to create "mash-ups" and other advanced apps more easily.

Think of all of this as the Web on steroids, with pointers that point to globs of stuff that comes categorized and tagged with a consistent ontology. In some sense, this is one of the dreams of symbolic AI, with Doug Lenat's Cyc project providing a prime example. That's why some people call Web 3.0 "the semantic Web." (There are lots of reasons the symbolists got things wrong, as any of Doug Hofstadter's Ph.D. students knows, but that's not what this column is about.)

Web 3.0 is not a replacement for the Web we know today. Rather, it's an extension of the information with which we already work. New tags and data exchange formats that provide meaning to things are already in the incubator stage. In any case, a majority of Web 3.0 instances today create "YourSpace" situations in which data is protected in business-controlled silos (but big plans exist to open things up).

Sometimes a demo can make abstractions like this easier to understand by making them tangible. To play around with a Web 3.0 app today, surf over to the SIMILE Project website and try your luck.

Let's zoom in. The problem in online games is that much of the game itself runs on literally millions of untrusted client PCs that connect to central servers. The idea of trusting state information (or data) that has been reported to central services by those millions of untrusted clients may seem naïve from a security perspective, and it leads to lots of potential for exploit, but that's how things work today. In the Web 3.0 world, trust boundaries get even trickier. In this case entire applications are constructed by mashing up data and functionality from all over the Web. Who is to say which data will be trusted and which pieces of functionality actually do what they say they do?
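To make the trust boundary concrete, here is a minimal Python sketch of a server that re-checks a client-reported position rather than accepting it at face value. Everything in it is hypothetical and invented for illustration: the `MAX_SPEED` limit, the `accept_move` function, and the idea that plausibility can be judged from distance and elapsed time alone. Real services need far more than this.

```python
import math

# Hypothetical limit: the fastest a player may legally move, in units per second.
MAX_SPEED = 10.0

def accept_move(last_pos, reported_pos, elapsed_seconds):
    """Return True only if the client-reported move is physically plausible."""
    if elapsed_seconds <= 0:
        return False  # a move that took no time cannot be checked, so reject it
    distance = math.dist(last_pos, reported_pos)
    return distance <= MAX_SPEED * elapsed_seconds

# The server compares each report against what it already knows.
plausible = accept_move((0, 0), (3, 4), 1.0)       # 5 units in one second
teleport = accept_move((0, 0), (300, 400), 1.0)    # 500 units in one second
print(plausible, teleport)
```

The point is only that the check runs on the side of the boundary you control. A mash-up that consumes someone else's data or functionality is in the same position as the game server, with far less knowledge of what "plausible" means.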

Practitioners of Web 3.0 invoke a "web of trust" idea, echoing the excellent thinking of PGP's Phil Zimmermann, but those of us in security know that the PKI web of trust as it currently exists leaves much to be desired. Challenges that remain open include revoking identities, privacy concerns, and figuring out how to evolve trust over time. In the meantime, attackers can literally play with the "minds" of Web 3.0 apps by falsifying XML data, providing services that don't do what they claim, and interposing on legitimate services. It ain't pretty.
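One narrow slice of the falsified-data problem can be sketched with a message authentication code. The Python below uses the standard `hmac` module; the shared key, the `sign` and `is_authentic` helpers, and the XML fragment are all assumptions made up for this illustration. It detects tampering with a payload after it was signed. It does nothing about a provider who signs false data in the first place, and it sidesteps the key distribution and revocation problems described above.

```python
import hashlib
import hmac

# Hypothetical secret shared by one data provider and one consumer.
SHARED_KEY = b"example-key-not-for-production"

def sign(payload: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the raw payload bytes."""
    return hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()

def is_authentic(payload: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(payload), tag)

original = b"<price currency='USD'>10</price>"
tag = sign(original)

# An attacker who alters the data in transit cannot produce a matching tag.
tampered = b"<price currency='USD'>1</price>"
print(is_authentic(original, tag), is_authentic(tampered, tag))
```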

Another big concern in Web 3.0 is control over data. In the simplest cases, this devolves to access control. As many of us in commercial security know, the Bell/LaPadula access control matrix has outlived its utility as a data security concept. Simply put, the matrices we're trying to manage are already too big, and Web 3.0 does all it can to make them even bigger. But things get much more complicated when you try to anticipate the eventual uses of the data you're providing to the Web. Very little thinking has gone into the privacy implications of massive data mash-ups. Think about how two separate databases that seem innocuous on their own (such as Google Maps and who owns Blu-ray players) might be useful to a Robin Hood intent on preying on the rich, and you see a glimmer of the problem.
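The two-database example is easy to sketch. In the Python below, both data sets are made up for illustration, and so is the `mash_up` function. Neither source says anything alarming alone, but a one-line join on a shared key produces a list of addresses worth visiting.

```python
# Hypothetical, invented records standing in for two separately published sources.
addresses = {"alice": "12 Elm St", "bob": "98 Oak Ave", "carol": "5 Pine Rd"}
owns_expensive_player = {"alice", "carol"}

def mash_up(address_by_person, owners):
    """Join the two sources on the key they share (the person's name)."""
    return sorted(address_by_person[p] for p in owners if p in address_by_person)

targets = mash_up(addresses, owns_expensive_player)
print(targets)
```

Neither publisher granted access to the combined result, and neither has an access control entry that covers it.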

Trust transitivity is a huge can of worms. If I allow you access to my tagged/bagged good data, thinking that you will control access to it, how can I anticipate that you won't make a mistake and publish it to the world for anyone to use? Attackers may set up services to do this on purpose. Likewise, data ambiguity will be something that attackers manipulate on purpose. "Semantic injection" errors, anyone?
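Trust transitivity amounts to reachability in a sharing graph. The sketch below is a plain breadth-first search in Python. The graph, the party names, and the `reachable` function are hypothetical, and the model assumes the worst case in which every party passes along everything it receives.

```python
from collections import deque

# Hypothetical sharing graph: each party maps to the parties it passes data on to.
shares_with = {
    "me": ["partner"],
    "partner": ["analytics_vendor"],
    "analytics_vendor": ["public_web"],
    "public_web": [],
}

def reachable(graph, start):
    """Return every party that can end up holding data released by `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        party = queue.popleft()
        for nxt in graph.get(party, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # the owner already has the data
    return seen

exposure = reachable(shares_with, "me")
print(sorted(exposure))
```

I granted access to exactly one party, yet the data can reach three. In practice the hard part is that nobody gets to see the whole graph.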

On the positive side, the idea of passing by reference instead of exposing objects in transit could be good for security. Ultimately, it may allow more control over the stuff. One of the known problems with SOAP messaging is that the actual stuff moves around and carries credentials, signatures, and other security things with it. In the worst cases, developers refer to the security tags as "goo" and do what they can to avoid having goo touch their stuff.
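Here is a toy Python sketch of why pass by reference may help. The `ResourceStore` class, its methods, and the example URI are all invented for illustration and do not correspond to any real framework. The owner keeps the data, hands out only a URI, and checks access each time the URI is dereferenced, so a grant can be withdrawn later.

```python
class ResourceStore:
    """Toy stand-in for a server that owns data and hands out references."""

    def __init__(self):
        self._data = {}     # URI -> content
        self._allowed = {}  # URI -> set of parties who may dereference it

    def publish(self, uri, content, allowed):
        self._data[uri] = content
        self._allowed[uri] = set(allowed)

    def revoke(self, uri, party):
        self._allowed[uri].discard(party)

    def dereference(self, uri, party):
        # The access decision is made at read time, by the owner of the data.
        if party not in self._allowed.get(uri, set()):
            raise PermissionError(f"{party} may not read {uri}")
        return self._data[uri]

store = ResourceStore()
uri = "https://example.org/reports/42"  # hypothetical URI
store.publish(uri, "quarterly numbers", allowed=["partner"])
first_read = store.dereference(uri, "partner")
store.revoke(uri, "partner")  # later reads by "partner" now raise PermissionError
```

Compare this with handing over a copy of the data: once the copy has moved, there is nothing left to revoke. The sketch says nothing about what a party does with the content after a successful read, which is the trust transitivity problem all over again.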

I've only addressed the tip of the iceberg when it comes to Web 3.0 security in this brief introduction. Looks to me like we have enough to keep us all duly entertained for years to come! As one of my Cigital colleagues puts it, "Web 3.0 will accelerate the de-perimeterization of enterprises, which will completely blow up the trust models inherent in enterprise application architecture today." Yay!

Acknowledgments: Special thanks to Brian Sletten of Zepheira, who actually works on Web 3.0 stuff all day and who brought me up to speed quickly. Also thanks to the Cigital Software Security Group for a stimulating internal discussion of Web 3.0 risks, which I mined for this article.