https://sites.google.com/feeds/content/site/belayresearchproject2016-12-09T17:31:43.620ZPosts of Team blogGoogle Sites1https://sites.google.com/feeds/content/site/belayresearchproject/82139267303419571392012-09-04T21:27:17.143Z2012-09-04T21:27:17.147Z2012-09-04T21:27:17.139ZService Notice

We've been running the current stable code base of Belay on public servers for almost a year. That phase of the project has come to the end of its life. We'll be turning down those servers on October 1st, 2012.

If you have need of the services we are currently running beyond October 1st, you can run your own copy of the code either locally, or on your own AppEngine application. Please contact me if you need any assistance.

— Mark Lentczner (mzero@google.com)

Mark Lentcznermzero@google.comservicenotice1https://sites.google.com/feeds/content/site/belayresearchproject/88679233960427059662012-05-21T22:22:29.342Z2012-06-07T19:11:14.296Z2012-06-07T19:11:13.669ZBelay Demo at IEEE Symposium on Security and Privacy

In a previous post, we gave an overview of routing in Belay, and how we used Alloy to get a more precise understanding of what could go wrong with Belay's launch operation. In this post, we go in-depth into the steps we went through to build up the model, and elaborate on some of the choices we made along the way. If you'd like to follow along in code, you should first get Alloy, and download the example code, which has separate files for most of the steps along the way, referred to as they come up.

The World and its Parts

In the informal discussion, we used a number of terms to describe the interacting pieces of Belay's routing. We at least used the concepts of windows, the browser and Belay IFrames, instances, strings, and identities. We also assumed that only one thing was happening at a time, in a sequence of serialized, atomic steps. We need to build up these notions in Alloy's description language.

Time

We'll start our exploration by defining a notion of time. Time is a linear series of steps, which Alloy has built-in support for expressing. The Alloy incantation we'll begin with is:

open util/ordering[Time] as ord

sig Time {}

The sig declaration adds a new kind of element to the world, so Alloy will search models with elements that belong to the Time set (in models, they will appear as Time0, Time1, and so on). The incantation on the first line imposes constraints on the elements of Time, so that there is exactly one Time in the relation called first, one in the relation called last, and all other Times make a straight line from first to last via a two-place relation called next. We can ask Alloy to instantiate the model we've described with run:

run {} for 4

This tells Alloy to instantiate the world we've described so far and show us what it looks like. To run, choose "Execute"->"run $1 for 4" from the application menu. With some fiddling of visualization (I had to open up the Theme pane, uncheck "Hide private sigs/relations" and choose ord from the "Projected over..." dropdown), we get a nice picture of time (just-time.als):

Windows and Instances

Time lets us talk about the state of the world changing in discrete events. Now we need to populate the world with interesting elements. We're going to keep the model as simple as possible, and we'll see that complexity quickly emerges from even a minimal specification.

In Belay, the notion of an instance isn't precisely defined. When we say instance with respect to an application on the web, we mean some persistent collection of data and behavior associated with a user's interaction with the application. This might be a single blog, a whole email inbox, or a shopping cart. There are two important features of an instance:

It has some notion of persistent identity over time;

It knows some set of data that might include instance-specific secrets, be partially shared with other instances, and change over time.

We abstract away all other details, and merely focus on these two features in our model:

sig Token {}

sig Instance {

identity: Token,

knownTokens: Token set->set Time

}

We choose to represent all data with a single kind of element, Token. This is an out-of-model description: We really mean that all data, all strings, all secretly-generated UUIDs, all JSON strings, are represented by sets of Tokens in the model. This uniformity is both coarse and useful. It will be difficult to miss a kind leak of information if all information can flow over every channel and be stored everywhere!

The definition of Instance, then, has two Token-based pieces. An Instance has a single identity, for all Time, there is a single piece of data that describes its identity. This is our model of the persistence of (real-world) instances. Along with its identity, an instance knows some set of other information. The information the instance knows isn't the same at all times, however; it can change over time. The declaration

knownTokens: Token set->set Time

says that an instances knownTokens are a many-to-many mapping between Tokens and Times. That is, for a given token, an instance might know that token at many different times. The inverse is true as well: for a given time, an instance might know a number of different tokens.

Given this definition, we can ask Alloy to instantiate examples for us:

run { } for 3 Time, exactly 1 Instance, exactly 2 Token

(tokens-and-instances-withfact.als) (Note: You might not get this example as the first one when you click "Execute". Try clicking "Next" in the toolbar to explore multiple examples.)

In this instantiation, we've told Alloy to only generate a single Instance, which is in the oval above. Its identity is Token1, but the set of Tokens that it knows changes over time. At Time0 and Time1 it knows about Token0, but at Time2 it doesn't, and has "forgotten" about Token0. This could be a model of a instance that allowed the user to delete some data between Time1 and Time2, for example. The Next button at the top of the frame can be used to explore other instantiations, as this isn't the only instantiation for 3 Time steps, one Instance, and two Tokens.

There's something a little odd about this instantiation, though. According to knownTokens, the instance isn't aware of its own identity at any time. Instances often need to report their identity in order to grant BCAPs that point back to them. Indeed, an instance is the only entity that actually knows its true identity. We enforce this by adding a constraint to our model:

fact IdentityIsAlwaysKnown {

all i:Instance, t:Time | i.identity in i.knownTokens.t

}

This constrains instances in our model to always know their identity at all times. Another way to think about it is that this fact tells Alloy that we aren't interested in exploring instantiations where instances don't know their own identity. Framing it that way makes it clear that facts are a little scary: they can cause Alloy to miss interesting examples if we over-constrain the model. Any fact we add should be carefully scrutinized to make sure that it only excludes truly nonsensical instantiations. If we run again, with this fact added, we get a different picture:

In this instantiation, the Instance always knows Token1, which is its identity. It happens that at Time2, it also knows Token0 which we can think of it "learning" somehow between Time1 and Time2.

The arrows can get a bit overwhelming in these pictures. We can also project the visualization over Time (the "Projected over ord/Ord" is a drop-down), and step through one Time at a time, using the << and >> buttons at the bottom. Here's the same instantiation in that view:

Instances look reasonable for now, so we can move on to thinking about windows in the browser. Windows actually have rich and complex behavior in browsers, with respect to navigation, events, the same origin policy, including IFrames within them, and so on. We are going to boil all of those details down to almost nothing, and say that we only care about two features for a window:

Whether or not the window is open (if it exists at a given point in time),

What instance is loaded into the window.

In Alloy, we write:

sig Window {

exists: set Time,

atInstance: Instance lone->set Time

}

A window might exist at several different points in time (it doesn't just flash into and out of existence), and it may have a particular Instance loaded into it at certain points in time. It doesn't ever have multiple instances loaded at the same time. Both of these assumptions are quite simplistic, and we might want to consider relaxing them. For example, a window might have an iframe that is running a different instance than the master instance, and this might confuse part of the routing fabric. We acknowledge our simplistic assumptions here. Instantiating the model with some windows yields interesting results (with-windows.als):

The Browser and Routing

Finally, we need to describe the Belay browser. We don't get into the details of the iframe implementation of routing here, and assume that Belay-style routing is built-in to the browser. That is, the browser itself keeps track of the map from tokens to windows:

one sig Browser {

routeMap: (Token lone->lone Window) set->set Time

}

The "one sig" says that there is only ever one Browser in an instance. We don't consider any multi-browser communication here. The routeMap in the browser, at any given time, may have many functional mappings from Tokens to Windows. Further, the same mapping is free to persist across multiple Times.

If we start to instantiate this model, we start to get some pretty weird behaviors. The browser adds and removes mappings from its routeMap arbitrarily, Windows jump to difference instances, and Instances gain and lose arbitrary information. We could try to constrain all this behavior with facts, but instead, we will take a more principled approach to describing the behavior of a browser over time.

State Transitions

We are going to model the browser as a state machine that nondeterministically makes one of several transitions at every time step. If the browser starts in some acceptable initial state, each step should take the browser to some acceptable next state. If it doesn't, and the transition is an accurate model of our implementation, then there is something we need to pay attention to in Belay's routing.

Initialization

To initialize the browser, we'll start with it in a "just opened" state. No windows are open, no routes are mapped, and no instances are loaded (init-example.als):

Note that we don't constrain the shape of Instances in an initial configuration (beyond the fact we defined above). They are free to know any set of Tokens, and have any Token as their identity. These sets might overlap or be shared between Instances, as they are in the above instantiation. In fact, if we were willing to be more precise, we might be able to be less draconian with our model of browser initialization. In this simple model, we start with a clean slate for the browser, and acknowledge that a richer model might help us explore more scenarios.

Transitions

Given an initial state, we should now define how we can update the state. Traditionally, Alloy models describe state update with a predicate of (at least) two places, describing the state before and after the update. This is the approach we take, and our pre- and post- states are both elements of Time. We will consider three kinds of transitions:

A trusted instance launch, such as when Belay launches Station from trusted local storage;

An untrusted launch, when one window requests that Belay launch an instance in another window;

A capability invocation, where one instance passes information to an instance in another window via Belay by providing that window's instance id.

Trusted Launch: We'll start at the beginning, and describe the time before and after a trusted launch operation. Before diving in to code, we should think about what a trusted launch is:

Belay knows the identity of whatever trusted instance it is launching,

Belay can update its routeMap correctly for later invocations,

After launching, this addition should be the only change to the routeMap,

After launching, there should be only one additional Window open, which has the Instance in question loaded,

After launching, no Instance should know any more Tokens than it did before the launch.

These intuitions can be translated almost directly into code:

pred launchTrustedIdentity[t, t': Time, w: Window, i: Instance] {

t' = t.next

// this window has never been open before

not (w in exists.(t.*prev))

// this window is the only new open window, all others are open still

exists.t' = (exists.t + w)

// this window holds instance i right now, others are the same

atInstance.t' = (atInstance.t + w->i)

// identity isn't mapped to already (don't open same Instance twice!)

not (i.identity in Browser.routeMap.(t.*prev).Window)

// consistent knownTokens

knownTokens.t' = knownTokens.t

// this instance is now mapped to by its identity

Browser.routeMap.t' = (Browser.routeMap.t + (i.identity)->w)

}

Here's what a such a launch looks like, if we start from an good initial state, ensured by init[t] (launch-example.als):

run {

some t, t': Time, w: Window, i: Instance {

init[t] && launchTrustedIdentity[t, t', w, i]

}

} for 2 Time, exactly 1 Instance, exactly 1 Token, exactly 1 Window

Initially, no windows exist, and the only Instance only has it's identity as a knownToken. The launch adds the mapping to the routeMap, and springs the Window into existence with the Instance loaded. The launch is trusted because we assume that the browser has some way to get the identity "right", without being informed by one of the windows or instances.

Checking, not Constraining: An interesting thought comes up here relative to one of our facts from earlier. We explicitly stated, via a fact constraint, that Windows couldn't be at an Instance unless they existed, like so:

fact InstancesVisitedByExistingWindows {

all w:Window, t:Time | some w.atInstance.t => w in exists.t

}

If we're switching to a model where we talk about models that satisfy state transitions, this seems less like a fact about the world, and more like something we'd like to verify of our state transitions. That is, we should check, not assume, that InstancesVisitedByExistingWindows holds after each transition. We can use Alloy to describe what a trace of events looks like:

pred trace[] {

init[first]

all t:Time - last {

(some w:Window, i:Instance | launchTrustedIdentity[t, t.next, w, i])

}

}

The trace[] predicate says that at the beginning of time, init is satisfied, and at all subsequent times, the state update is the result of a trusted launch. Given this, we can take our fact from above, and turn it into an assertion about traces:

assert InstancesVisitedByExistingWindows {

trace[] =>

all w:Window, t:Time | some w.atInstance.t => w in exists.t

}

check InstancesVisitedByExistingWindows for 4

No counterexample found. Assertion may be valid. 193ms.

Now, we've removed the constraint on all Windows from before, and changed it into a property we can check of our state transition model. This is a great improvement: we've taken one of our assumptions and turned it into something checkable. Even better, we can continue to check this assertion as a sanity check against any new transitions we add, to help ensure that we don't send our model off the rails anywhere.

There are other useful things to assert here that are more specific to launchTrustedIdentity, as well. For example, it should be the case that all trusted launches grow the routeMap by exactly one entry:

assert TrustedLaunchGrowsRouteMapByOne {

all t, t': Time, w: Window, i: Instance {

launchTrustedIdentity[t, t', w, i] =>

#(Browser.routeMap.t') = #(Browser.routeMap.t) + 1

}

}

check TrustedLaunchGrowsRouteMapByOne for 5

No counterexample found. Assertion may be valid. 0ms.

Checking this also results in no counterexamples. Note that there is no requirement that trace[] hold here; this is a more general assertion over all trusted launch events. Writing these assertions is like unit-testing the specification: they aren't the end result we're aiming for, but we can sanity-check the implementation by making sure they are never violated. These checks, and this implementation of trace[], are in trace-and-test.als.

Invocation: Next, we need a definition of invocation between instances. In an invocation, an instance decides to pass some information, in the form of Tokens, to some other instance. In a client-side invocation, this happens between windows, so an instance can try to send an invocation to any window for which it has the correct Token in the routeMap of the browser. The invocation adds the arguments to the knownTokens of the instance in the targeted window, and doesn't change anything else:

Now we've added a new kind of state transition, so we need to update trace[] to allow invokes into the trace. To add invoke to the trace, we existentially quantify (using some), over the non-Time arguments to invoke, and add it as a case to trace[]:

Alloy is having trouble checking a quantification over a set, due to something called "skolemization." This technique relies on the finite nature of Alloy models to handle quantifications. For example, consider an instantiation bounded at 3 Tokens: T1, T2, and T3. A quantification of

some t:Token | somePredicate[t]

Really means

some t:{T1, T2, T3} | somePredicate[t]

Which is equivalent to:

somePredicate[T1] or somePredicate[T2] or somePredicate[T3]

This expansion is known as skolemization (this example is derived from page 154-155 of Jackson's excellent book on modeling with Alloy). This doesn't work so well for set constraints, evidently, though I'm not completely sure why. I would guess that it's because Alloy would need to generate powerset-like expansion:

In any event, there are several solutions that we don't address directly here. For this exposition, it is enough to consider single-argument invocations, where we simply drop the set constraint, and allow for only one args Token:

The state is exactly the same at Time2 and Time3! The same Token filled all three roles: It was the target of the invocation, the identity of the only instance, the capability that was invoked, and the argument passed for the invocation. This actually could happen in the real system, when an Instance invokes one of its own capabilities. So this is a perfectly valid instantiation. If we use "Next" to find more examples, we get what we came for:

In this example, the extra arrow from Instance3 to Token1 appears, which is exactly the argument that was passed from Instance0 in Window1 to Instance3 in Window0. Up to here is included in invoke-example.als.

General Launching:

Now we need to model launches that aren't the trusted launch from before, where the launcher gets to decide on the entry in the routeMap for the new window. Aside from this change, this launch is almost identical to the launch from before (with the appropriate update to trace[]):

Asserting A Property

Now we can try to state the property that we want to hold of invocations. Our assumption is that the targetIdentity of an invocation identifies the Instance that the cap and args of the invocation are meant for. Crucially, the cap and args aren't meant for any other Instance. To model this, we write an assertion that states that after an invocation, the targeted Instance is the only one that has access to the cap and args that didn't already:

The statement is a little complicated, so let's insert some English. The wrapping of "trace => all t, t'..." is just saying that we're only interested in this property for valid traces, and we're interested in all invocations in valid traces. The inner part requires more unpacking:

(all i: Instance { // for all the instances

i in knownTokens.t'.(cap + args) => // IF the instance is aware of anything in the cap or args AFTER the invoke

(i in knownTokens.t.(cap + args) or i.identity = targetIdentity)

// THEN that instance had better have known it before, or be the targetIdentity

})

If we check this with Alloy, we get a counterexample (as we expected). We start with two windows open, and we can see the problem lurking: Window0 is mapped to by Token3 in routeMap, but the Instance it holds has identity Token 0. This example is in (counterexample.als).

Indeed, in the next step, Instance2 mistakenly receives Token3, which Window1 intended to go to the Instance with identity Token3. We know this because of the $NoBadInvokes_targetIdentity annotation on Token3, which is Alloy annotating the elements of the visualization with their role in the counterexample. The invocation was mis-routed to Instance2, which learned something it wasn't supposed to, because it was opened with the wrong mapping in routeMap.

We seem to have found our bug! But we should still be skeptical. The statement of NoBadInvokes is awfully complicated, even though it gave us what looks like a reasonable counterexample. We weren't satisfied, so we came up with a different way to say the same thing:

Checking this yields a nearly identical counterexample. But were our statements equivalent? We can check that, too! We wrap both asserts in predicates, and check that they hold in exactly the same situations:

pred NoBadInvokesP[] {

trace =>

... // as before

}

assert NoBadInvokes {

NoBadInvokesP

}

pred NoBadInvokes2P[] {

trace =>

... // as before

}

assert NoBadInvokes2 {

NoBadInvokes2P

}

assert BadInvokesSame {

NoBadInvokesP <=> NoBadInvokes2P

}

check BadInvokesSame for 5

>> Executing "Check BadInvokesSame for 5"

Solver=sat4j Bitwidth=0 MaxSeq=0 SkolemDepth=4 Symmetry=20

188404 vars. 440 primary vars. 311052 clauses. 25157ms.

No counterexample found. Assertion may be valid. 490032ms.

Note that this took a really long time (nearly 10 minutes)! The iff operation is quite expensive, which probably has something to do with the exhaustiveness of the property. But, we've stated the property twice, and both have the exact same meaning (up to models of size 5). The whole model is in full-example.als.

We've gone through the process of modeling our error and convincing ourselves that our model correctly represents it. In a future post, we'll go through the details of both the implementation and model of our fix, and what properties the fix satisfies.

The Belay project has enabled new kinds of window-to-window communication within browsers. We've made this possible by building a mechanism for sending messages between potentially mistrusting windows at different origins.

Routing in Belay

One of the main jobs of Belay's browser presence is to define the invocation of BCAPs granted by a JavaScript CapServer in a browser window. This client-to-client case differs from other typical scenarios of invocation: for server-to-server or browser-to-server communication, invocation maps easily onto HTTPS, for example. This is because server-granted BCAP URLs specify a domain name, which is resolvable by the local DNS service, and is globally unique, because of how the space of domain names is set up. In order to have invocations work between windows in the browser, we need the same properties as for HTTPS URLs:

A notion of persistent, globally unique names for browser windows.

An analog of DNS resolving for browser windows.

Routing by Assigned Identities

For (2), we have the Belay process (currently, the Belay IFrame) keeps a map from strings to windows. The strings represent the identity of the CapServer running in the page. CapServers on pages are informed of their entry in the map, and can mint new BCAPs that refer to these identities (the analog of minting a BCAP that refers to a web server by domain name). These BCAP URLs have a slightly different scheme, shown with the identity part in bold:

urn:x-cap:b78a62b1-801a-492e:a0db1c00-dd41-46ef

To invoke such a BCAP, the page passes the URL and the serialized arguments to the Belay IFrame, which uses its map of identities to route the request to the right page, and return any response back to the invoker.

Assigning Identities

This solution assumes that identities for the windows exist, and have been communicated to the Belay IFrame. But browser windows don't really come with a good notion of identity, so solving (1) is a little more subtle. In our initial design, we decided to invent identities for windows by making the window's opener responsible: When a new instance is launched in a new window, the whatever page initiated the launch picks a string to use as the identity of that window, and communicates it to the Belay process.

The launcher might pick this string completely at random, perhaps for a window that will hold a page it has never seen before. Or, it might pick the string to provide some persistence, as in the case of Belay Station re-launching an instance. The instance may have granted some BCAPs in the past, using a particular instance id, and the only way for it to continue serving those requests on the next launch is for the launcher to associate it with the same instance id. For example, we want to support the following:

A Buzzer instance is created and launched with identityb022

The instance grants a posting BCAP using that identity, the BCAP looks like: urn:x-cap:b022:a0db

That BCAP is handed to an Emote instance

The windows holding the Buzzer and the Emote instance are closed, with the launch BCAPs for the instances saved to the user's Station.

The next day, the user opens both the Buzzer and Emote instance from their Station.

The Emote instance invokes the posting cap urn:x-cap:b022:a0db, which should be routed to the window the Buzzer instance has been reopened to.

At step 5, Station (the launcher) needs to provide the identity b022 to the Buzzer instance's window, otherwise the requests will not be successfully routed during the invoke in step 6.

What Could (Possibly) Go Wrong?

We just described a mechanism for launching windows that gives their content some persistence across sessions. This lets Belay's notion of an "instance", a persistent, application-defined set of data and behavior, exist on top of browser-specific concerns like windows and data-only communication channels. But does it really work? Will BCAPs always be routed to the "right" instance? Can any page launch new instances, or is that a special, Station-only behavior? Could there be a mimicry attack, where one instance masquerades as another?

Potential for Sadness

Based on the our implementation above, we were pretty sure that launch was a highly-trusted operation. A malicious or mistaken launch could assign an instance's identity to route to a window running a completely different instance. For example, take a variation on the example above:

A Buzzer instance is created and launched with identityb022

Another Buzzer instance is created and launched with identitya955

The first instance grants a posting BCAP using its assigned identity; the BCAP looks like: urn:x-cap:b022:a0db

That BCAP is handed to an Emote instance

The windows holding the Buzzer and the Emote instance are closed, with the launch BCAPs for the instances saved to the user's Station.

The next day, the user opens Buzzer a955 and the Emote instance from their Station, but Station accidentally assigns identity b022 to the Buzzer's window.

The Emote instance invokes the posting cap urn:x-cap:b022:a0db, which posts to the wrong Buzzer.

This is a case where Station makes a mistake, but exposing the launch operation to any page would allow malicious pages to deliberately launch instances that impersonate others.

We realized this weakness existed through Conversation and Thinking Real Hard(CaTRH). While we consider ourselves seasoned practitioners of CaTRH, it does make our heads hurt sometimes, and we'd prefer to resort to it less often. We don't know what other weaknesses are lurking in the system that we've missed, and we could use some help gaining confidence in their absence. Plus, we'd like to free up our valuable Conversation time for more engaging diversions.

Reducing CaTRH's Role

Alloy is a language and tool for exploring relational models. Given a relational specification of a system, it can generate instantiations of the system, visualize the instantiations, and check properties of the instantiations. If we can describe Belay's routing fabric in an Alloy model, we can give a more formal semantics to our intuitions generated from CaTHR. The rest of this post describes how we used Alloy to crisply model the weakness we described informally above, and verify a solution. We provide a much more detailed account of the Alloy model in a separate post, and only provide a brief overview here.

The world of our launch-and-invoke model is made up of 5 kinds of things: the Browser,Windows, Instances, Tokens, and Time. A Token is a model of any and all serialized data, from UUIDs to URLs to postMessaged strings. A Window holds a currently-loaded Instance, and can hold a different Instance at any point in Time. An Instance has an identity represented by a Token, and also is aware of a set of knownTokens. The Browser keeps track of a mapping from Tokens to open Windows, in order to route invocations to the right Window.

Why Launch Must Be Trusted, Formally

We use Alloy to model the launching operation we described above, where the launcher decides which identity (modeled as a Token), is used to route to the newly-opened Window. We also define invocation, which allows an invoker Instance to pass any Tokens that it knows about to any other Instance open in Window for which the invoker knows the entry in the browser's routing map. If the routing fabric is working correctly, then after an invocation, only the Instance whose identity is the Token the invoker chose to route to should know the passed Tokens. It would be an error to pass the arguments to a different instance, for example, which is the bug in step 7 of the mistaken Station above.

This is a lot of code (which is built up more slowly in the related post). The gist is that an invoke is described by a pre and post state (t and t'); an invoker and a target window; a targetIdentity, which should route to the target window; and a cap and some args, which are the Tokens to pass to the target. The invariant described by NoBadInvokes is that after every invocation, the instances with each of the argument Tokens in their knownTokens is either the same as before, or the same as before plus the instance with the identity of targetIdentity (the purported target instance of the invocation). This says that no instance other than the targeted one should learn the arguments.

When we run the check command, Alloy immediately reports a counterexample. We'll go through it step by step.

At Time1, there is a single window open, which the Browser maps to via Token1 (shown as the orange arrow labeled routeMap[Token1]). This window, Window1, is currently navigated to Instance1, which has the identity of Token1. The brown arrows represent the Tokens the instance knows---the set of knownTokens---which aren't labeled merely to avoid cluttering the example. There are two other instances in the world, Instance1 and Instance2, which aren't loaded in any browser window.

If we step forward in time, we see that a new window is launched:

Now there is a new window, Window0, open, which is navigated to Instance2. Instance2 has identity Token0, but its enclosing window, Window0, is mapped to by Token3 in routeMap. This is the wrong identity for Instance2. Also note that at this time, the only Token that Instance2 knows of is Token0. If we step forward in the counterexample one more step, we can see what changes:

Here, we see that Instance2 now knows about Token3, after an invocation occurred. But this shouldn't be possible---Instance1, the instance in the invoking window, doesn't even know Token0, the identity of Instance2. It "tried" to invoke a capability at the instance with identity Token3 (we know this because of the $NoBadInvokes_targetIdentity annotation), but the invocation was merely routed to the window mapped to by that token, with no guarantee that the instance in the window actually had that identity. This potential for misassignment of identity in routing is why launching must be treated as a trusted operation: malicious or buggy launchers can break important invariants of the system.

Allowing Untrusted Launches

We'd like to provide the launch operation to instances other than Station (or other trusted parties). It's just plain useful; some sites might have built-in mini-stations that keep track of related sub-instances, like launching your Calendar from Gmail. Others might want to turn middle-click (open-in-new-window), into a launch operation to support common web gestures. Relegating the launch operation to a trusted whitelist of instances simply won't always work.

We have a (CaTRH-derived) proposed solution. Instances will keep track of a secret that represents their identity. There will be a global hash function for creating routing identities from secret identities. The hash of the instance's secret identity can be freely shared with other instances, so they can invoke capabilities at the instance, or even launch it. Upon launch, an instance must provide it's secret identity to the trusted browser architecture. The browser can then apply the hashing function against the purported routing identity of the instance, and ensure that the launch is legitimate.

We change our implementation of the model, make a (slight) change to our definition of NoBadInvokes based on the change, and re-check. Alloy doesn't report any counterexamples:

In a future, longer post, we'll explain the updated model, and a discussion of why this helps us justify our fix. The related tutorial post only gets up to the point of explaining the counterexample above.

The Payoff

To understand what the model is doing for us, consider what we'd have done in a world without our Alloy excursion. We could have built a proof-of-concept site that abused an open implementation of launch. Then, we could have implemented the fix, and watched our proof-of-concept break. We wouldn't have known if another malicious site could be implemented against our fix, or if our idea was even worth the effort of implementing. We wouldn't know if the bad behavior was due to a simple bug in our implementation, or if we really were implementing a flawed idea.

With the model in hand, we know what the problem was, and that our fix is at least conceptually sound. We still have to implement it correctly, but at least we know that what we're trying to implement is consistent with our goals. Now we've:

Understood why our intuition that launch should only be provided to trusted parties was correct;

Crisply defined what a security failure meant in terms of information reaching the wrong party;

Formally specified our fix on top of the old system;

Gained confidence that the fix doesn't exhibit the same weakness.

Moving forward, we'll build more models, extend existing ones, and address more properties than the straightforward one we've outlined here. We're excited about increasing our own understanding of Belay, and also for the opportunity to communicate crisp specifications of Belay's design and techniques.

In the fall after their internship, Joe & Arjun (along with a Brown Master's student, Matt Carroll) wrote a few applications on top of Belay and its principles that saw some real use. As of yesterday, they're all open source (with a corp-friendly Apache license). The relevant repository is on Github:

The app that saw the most use was Resume, which Brown Computer Science used to conduct their faculty recruitment process this year. It saw more than a thousand unique users in a span of a few months. It was used by hundreds of applicants, yet more hundreds of reference-letter writers, and the faculty and staff of the department to review the applicants. A brief blog post outlines some of our experiences. We also built Continue, a conference management application, which hasn't yet seen real use.

Implementing these applications, and putting them in front of real users (on real browsers!), led us to extend Belay's libraries with a few extra features. Since a Belay post wouldn't be complete without some references to code, links to the implementations of some of the features we worked through are inline below.

Cryptographic capabilities are useful, because they avoid the database inserts of the grants that the existing Belay libraries do. We implemented a capability server that can grant capabilities with their private data cryptographically encoded in the BCAP URL.

Some of these features may find their way into the mainline of Belay in the coming months. The broader message is that Belay can be used in real application development, and its libraries are flexible and extensible. Belay is for building real-world, user-facing, secure applications, and will only become more robust and useful as we keep at it.

jpolitz@google.comopen-sourcebelayapplications5https://sites.google.com/feeds/content/site/belayresearchproject/69245545228437658892012-02-22T00:48:49.248Z2012-02-22T00:48:49.254Z2012-02-22T00:48:49.243ZNew material and a poster

I've put up five new pages of introductory information about Belay. See the links under each of the six sections on the home page.

We also have a spiffy new poster. The full version is available on the materials page, preview here:

The Belay team is happy to announce that the recent technology preview release of Dart includes a port of the Belay sample application "Buzzer". We ported the front end, back end server, and the BCAP libraries to Dart. The resulting web application interacts with rest of the Belay infrastructure, including exchanging BCAPs with JavaScript written code.

The samples/belay/README in the Dart code base has instructions for getting it all running and trying it. At present, we admit those are a little cumbersome, as it requires getting the Belay infrastructure running on your machine as well. Look for an update on the Dart port that will make it easier in a week or so.

Mark Lentcznermzero@google.comwelcomedartexplorers1https://sites.google.com/feeds/content/site/belayresearchproject/44571979833626905612011-07-28T18:01:56.809Z2011-07-28T18:12:53.884Z2011-07-28T18:12:53.867ZDemo and Talk at Mozilla

We gave a demo and talk at Mozilla earlier this week. This is the best version of the demo yet (if we do say so ourselves), and features creating custom pages on a web site without account names, passwords, or cookies! The slides are attached to this post, and the video of the talk is now available, curtesy Air Mozilla. Video timeline for those who want to skip around:

00:00 Introduction and background to the project

06:20 Demo of running system

21:00 Technical description of how parts of the demo work (with lots of code!)

We're excited to announce that our code base is now open on code.google.com: Visit google-belay.

This is the live repo that the team is coding in. It works, but we've made it "one-click" installable or runnable. In particular, you'll need the AppEngine Python SDK, and you'll need to be comfortable setting up local development instances with it. See our admittedly terse README.txt file.

If you're pulling from head, you'll see everything we're doing. If you want to pull it and play around with it yourself, we suggest pulling tag v0.2. You can grab it from the downloads section, or from the repo directly: