+++ This bug was initially created as a clone of Bug #692677 +++
In B2G, we have two scenarios currently in which this is needed
- "home screen", that loads applications installed from different URIs
- web browser UI, obviously needs to load arbitrary URLs
(The web browser needs some more things, but I'll file separate bugs for those.)
I think we could probably solve the web-browser case without needing an explicit grant of privilege. If we had an element that was
- like a XUL <browser> (i.e. ignored X-Frame-Options)
- tied to an input box ("URL bar") that could only explicitly be navigated by the user
- unlike <browser> or <iframe> couldn't be navigated programmatically (or could, but restricted to normal <iframe> security)
then I think we could build a UI around that. We could call that new element <browser> since that's unused in HTML currently. The biggest problems then would be implementing open-in-new-tab and window.open(), but I think those can be solved separately.
The "home screen" app launcher is a harder problem. To enable window-manager style features like "expose" and other animations, the home screen needs references to the app "windows". Further, it's always going to be launching apps programmatically, because it translates input like clicking on an icon into the app-launch action. So hm. I can't think of a way around an explicit permission grant for this. If we need a grant for the home screen, we might as well use that for the web browser too.

That's fine, I conceded that.
What should the API look like? Adding a new HTML element seems like overkill. In both cases what's really needed is just a way to ask that X-Frame-Options be ignored. How about a new iframe attribute? Strawman: <iframe crossOrigin>, with crossOrigin interpreted the same way as for <img>, but with an additional value: "system". crossOrigin="system" would turn off x-frame-options, and would require a permission grant from the user. That'd be a neat way to enable canvas.drawWindow() or texImage2D(window) too, which we might want to use for effects like WebGL transition animations.
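To make the strawman concrete, here is a minimal sketch of the load decision it implies. Everything in it is an assumption for illustration: `FrameMode`, `FrameLoad` and `mayRenderInFrame` are made-up names, not Gecko or HTML APIs, and SAMEORIGIN handling is reduced to a single boolean.

```typescript
// Hypothetical model of the <iframe crossOrigin="system"> strawman.
// None of these names exist in Gecko or in any spec.
type FrameMode = "normal" | "system"; // "system" stands in for crossOrigin="system"
type XFrameOptionsValue = "DENY" | "SAMEORIGIN" | null; // null: header absent

interface FrameLoad {
  mode: FrameMode;
  embedderHasGrant: boolean; // the user granted the embedding app the privilege
  sameOrigin: boolean;       // simplification: framed document shares the embedder's origin
  xFrameOptions: XFrameOptionsValue;
}

function mayRenderInFrame(load: FrameLoad): boolean {
  // A "system" frame ignores the header, but only with an explicit grant.
  if (load.mode === "system" && load.embedderHasGrant) return true;
  // Assumption: without a grant, a "system" frame behaves like a normal one.
  if (load.xFrameOptions === "DENY") return false;
  if (load.xFrameOptions === "SAMEORIGIN") return load.sameOrigin;
  return true;
}
```

The point of the sketch is only that the attribute alone does nothing; the permission grant is what switches off the header check.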

Well, I suppose it could make sense to standardize APIs for privileged applications, but like we've discussed before, I think those APIs should be very clearly separated from APIs for regular Web apps. So I wouldn't want to overload an existing attribute.

I think the meaning of crossorigin on <img> is very different from what we want here. On <img> it ensures that the contents of the <img> is accessible to you by using CORS to load the data.
What we want here is to not change how the load happens at all, but rather to grant access to the data after the load is done.
Another way to look at it is that <img crossorigin> still stays within the web's normal same-origin-but-relax-if-content-producer-explicitly-allows security model. What you want here is something that requires special privileges.
We'll need something more than just ignoring x-frame-options. For example window.top should not return the top-level B2G home screen or browser-app window, it should return the .contentWindow of the <iframe iamaspecialiframe>.
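The window.top behavior can be sketched as a walk up the frame tree that stops at the special iframe. This is a toy model under assumed names (`FrameNode`, `resolveTop`, `isBoundary`), not how Gecko represents windows.

```typescript
// Hypothetical frame-tree model; FrameNode and resolveTop are illustrative names.
interface FrameNode {
  label: string;
  parentFrame: FrameNode | null;
  isBoundary: boolean; // true for the content window of the special iframe
}

function resolveTop(start: FrameNode): FrameNode {
  let current = start;
  // Walk up until we reach a boundary frame or run out of parents.
  while (!current.isBoundary && current.parentFrame !== null) {
    current = current.parentFrame;
  }
  return current;
}

// Home screen embeds an app in the special iframe; the app embeds an ordinary iframe.
const homeScreen: FrameNode = { label: "home-screen", parentFrame: null, isBoundary: false };
const appWindow: FrameNode = { label: "app", parentFrame: homeScreen, isBoundary: true };
const nestedFrame: FrameNode = { label: "nested", parentFrame: appWindow, isBoundary: false };
```

In this model, content nested inside the app resolves its top to the app's window and never sees the home screen.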
There are also some things related to what <a target> is allowed to target. But this is all pretty simple stuff, I think. We should have most of the plumbing in Gecko already.
And yes, unfortunately I can't think of anything that doesn't require permission granting either. Things like click-jacking means that simply embedding another website and ignoring x-frame-options comes with some risks to the user. So an explicit grant will be needed.
And to implement browsers we'll absolutely need to be able to reach in to the frame and dig out data.
For implementing browsers, do we need some way to enable a process boundary at the new <iframe>? It definitely seems like the home screen will need that, right?

(In reply to Jonas Sicking (:sicking) from comment #6)
> And to implement browsers we'll absolutely need to be able to reach in to
> the frame and dig out data.
>
Maybe, maybe not. I'm not sure yet. I'm operating under the assumption that we mostly won't. I'm hoping we can rely on gecko to do things like find-in-page, but it's a bit harder on touch screens since the find isn't initiated by a keyboard.
>
> For implementing browsers, do we need some way to enable a process boundary
> at the new <iframe>? It definitely seems like the home screen will need
> that, right?
I'm happy to let gecko make that decision; the iamaspecialiframe property could be used as a hint. This does bear on the "reach into the frame" problem though. It's not going to be possible to do that across processes. Another thing we'll need is some kind of "this page is really important and has to have low latency", e.g. for a dialer, which would be a hint to gecko to load that app in a new process. Need to make sure the critical apps can't be DoS'd.

It would make me sad for this to be whitelist only. That would pretty much kill competition around browser UIs, since in practice the user would only be able to use the one that shipped on the device. I'm less OK with whitelisting this permission than I am for something like telephony.

I don't think we can leave the process separation up to gecko since, as you note, it affects whether you can reach into the iframe or not.
I'm pretty sure we'll definitely need for browsers to reach into pages. I'm fairly sure that the Firefox front-end does so in lots of places. Session restore being one, addons being another.
As for security, I don't mean "whitelist" in the sense of something static that only includes, for example, pre-installed apps. I also think we should allow this for apps that are installed from trusted stores, where the store has indicated that the app is entrusted with some special privileges. (Additionally, users should be able to explicitly grant apps this, and any other, privilege, so as to prevent the Apple one-store-to-rule-them-all model.)
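A rough sketch of the three grant sources described above, assuming a made-up app record. `InstalledApp`, `hasPrivilege` and the privilege string are all hypothetical; nothing here reflects an actual B2G permission store.

```typescript
// Hypothetical permission model; field and function names are illustrative only.
interface InstalledApp {
  origin: string;
  preinstalled: boolean;            // shipped on the device
  fromTrustedStore: boolean;        // installed from a store the device trusts
  storeEntrustedPrivileges: string[]; // privileges that store vouched for
  userGrantedPrivileges: string[];    // privileges the user granted explicitly
}

function hasPrivilege(app: InstalledApp, privilege: string): boolean {
  // 1. Pre-installed apps (assumption: they are trusted for everything).
  if (app.preinstalled) return true;
  // 2. A trusted store vouched for this specific privilege.
  if (app.fromTrustedStore && app.storeEntrustedPrivileges.indexOf(privilege) !== -1) return true;
  // 3. The user granted it directly, regardless of where the app came from.
  return app.userGrantedPrivileges.indexOf(privilege) !== -1;
}
```

The third branch is what keeps this from collapsing into a single-store model: a user grant works even for an app that no store has vouched for.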

(In reply to Jonas Sicking (:sicking) from comment #11)
> I'm pretty sure we'll definitely need for browsers to reach into pages. I'm
> fairly sure that the Firefox front-end does so in lots of places. Session
> restore being one, addons being another.
>
Huh ... what does session restore need from content? Values of input fields? The frontend needs to reach into content for the context menu; I'm not sure that's an issue for b2g. Maybe it is. Browser frontends wouldn't be able to create their own addon systems with power equivalent to classic firefox addons or jetpacks (that'd be a security nightmare anyway), but jetpack-style addons would work in b2g.
If we do need to reach into iframes, another option is for us to standardize a way of loading code into iframes and communicating with them, probably postMessage(). Something like message manager but maybe could be simpler. That sounds like it would be really hard to spec and standardize though.

> Huh ... what does session restore need from content? Values of input
> fields?
Form fields in general yeah. And scroll positions of anything scrollable.
> Browser frontends wouldn't
> be able to create their own addon systems with power equivalent to classic
> firefox addons or jetpacks (that'd be a security nightmare anyway), but
> jetpack-style addons would work in b2g.
jetpack addons can reach into page contents. And I believe that at this point all major browsers have greasemonkey-like functionality or addons.
> If we do need to reach into iframes, another option is for us to standardize
> a way of loading code into iframes and communicating with them, probably
> postMessage(). Something like message manager but maybe could be simpler.
> That sounds like it would be really hard to spec and standardize though.
We'll need something to enable us to build firefox for B2G.

Ah, I think my bug report https://bugzilla.mozilla.org/show_bug.cgi?id=698821 is a bit of a duplicate of this one. I started working on a basic browser for Gaia/B2G and have some potential use cases for this.
Events that may be needed outside the iFrame:
* Load Start - When the iFrame starts loading a new document
* Progress monitor - to monitor the progress of the document being loaded inside the iFrame
* Load Stop - When the iFrame finishes loading a new document
* Title Change - When the title of an HTML document inside the iFrame changes
Other use cases might include:
* Get Title - Get the title of the current HTML document inside the iFrame
* Stop Load - Programmatically stop the iFrame from loading a document
* Suppress X-Frame-Options Header - Render the contents of a document inside the iFrame, even if it was returned with an X-Frame-Options DENY or SAMEORIGIN header. (e.g. web sites like GMail which return this header to prevent phishing scams will still be rendered inside this special iFrame).
* Get favicon URL - As this can be contained inside the HTML document itself
* Get thumbnail - Get a thumbnail image of the content of the iFrame, for use in graphical window manager/tab switching type features
* Zoom - Programmatically make the iFrame content larger or smaller
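As a rough illustration of what the events and a few of the use cases above might look like from the embedding app's side, here is a mock with no real frame behind it. All names (`BrowserFrameEvent`, `MockBrowserFrame`, the event kinds, the methods) are hypothetical, and favicon and thumbnail access are left out.

```typescript
// Hypothetical API surface; nothing here is an existing Gecko interface.
type BrowserFrameEvent =
  | { kind: "loadstart"; url: string }
  | { kind: "progress"; fraction: number }
  | { kind: "loadstop"; url: string }
  | { kind: "titlechange"; title: string };

type BrowserFrameListener = (ev: BrowserFrameEvent) => void;

class MockBrowserFrame {
  private listeners: BrowserFrameListener[] = [];
  private pendingUrl: string | null = null; // non-null while a load is in flight
  private title = "";
  private zoom = 1.0;

  addListener(fn: BrowserFrameListener): void { this.listeners.push(fn); }
  getTitle(): string { return this.title; }
  getZoom(): number { return this.zoom; }

  setZoom(factor: number): void {
    if (!(factor > 0)) throw new RangeError("zoom factor must be positive");
    this.zoom = factor;
  }

  // Simulates the frame starting to load a document.
  beginLoad(url: string): void {
    this.pendingUrl = url;
    this.emit({ kind: "loadstart", url });
  }

  // Simulates the load completing and the document setting its title.
  finishLoad(title: string): void {
    const url = this.pendingUrl;
    if (url === null) return;
    this.emit({ kind: "progress", fraction: 1 });
    this.title = title;
    this.emit({ kind: "titlechange", title });
    this.pendingUrl = null;
    this.emit({ kind: "loadstop", url });
  }

  // "Stop Load": abandons the in-flight load, if any.
  stop(): void {
    const url = this.pendingUrl;
    if (url === null) return;
    this.pendingUrl = null;
    this.emit({ kind: "loadstop", url });
  }

  private emit(ev: BrowserFrameEvent): void {
    for (const fn of this.listeners) fn(ev);
  }
}
```

A browser UI would register one listener and update its URL bar, throbber and tab title from the event stream, without ever touching the framed document directly.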
In order to enable alternative browser UIs, it would seem that these permissions need to be explicitly given to certain apps by the user. Maybe it could be phrased something along the lines of "Allow this app to look inside and control other apps and web sites you use."
Note previous similar work done on Mozilla Chromeless https://github.com/mozilla/chromeless/blob/master/modules/lib/web-content.js
If B2G is to promote the *web* as an app platform rather than just Mozilla, then it does seem this would need to be standardised eventually.

The summary of the quick meeting was
(1) we need a third docshell type with some of the properties of the chrome/content boundary (X-Frame-Options is ignored, window.top doesn't walk past it), except that privileged web content can create that type of docshell.
(2) for prototyping the API we need to implement a browser, we'll have a semi-magical little wrapper script that lives in the browser app and has access to a postMessage() port. The other end of that port will live in a chrome-privileged magic script in the <iframe magicsauce>. Frame script seems to be what we want here. The frame script will forward all the data we need to the wrapper script. When we settle on the necessary API, we'll move all this into Gecko.
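The shape of (2) might look roughly like the following. This is only a sketch: `FakePort` is an in-memory stand-in for a postMessage() port that delivers synchronously (a real port would not), and `installFrameScript`, `BrowserWrapper` and the topic names are invented for the example.

```typescript
// In-memory stand-in for a postMessage() port pair; all names are hypothetical.
type PortMessage = { topic: string; payload: string };
type PortHandler = (msg: PortMessage) => void;

class FakePort {
  peer: FakePort | null = null;
  onmessage: PortHandler | null = null;

  postMessage(msg: PortMessage): void {
    // Deliver synchronously to the other end; a real port would be asynchronous.
    const target = this.peer;
    if (target === null) return;
    const handler = target.onmessage;
    if (handler !== null) handler(msg);
  }
}

function createPortPair(): [FakePort, FakePort] {
  const a = new FakePort();
  const b = new FakePort();
  a.peer = b;
  b.peer = a;
  return [a, b];
}

// Frame-script side: runs next to the content and forwards what it observes.
function installFrameScript(port: FakePort): (topic: string, payload: string) => void {
  return (topic, payload) => port.postMessage({ topic, payload });
}

// Wrapper-script side: lives in the browser app and exposes the forwarded data.
class BrowserWrapper {
  title = "";
  received: string[] = [];

  constructor(port: FakePort) {
    port.onmessage = (msg) => {
      this.received.push(msg.topic);
      if (msg.topic === "titlechange") this.title = msg.payload;
    };
  }
}
```

The browser app only ever talks to the wrapper, so moving the forwarding logic into Gecko later would not have to change the app's code.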
We decided to punt the problem of the browser itself interacting with content, e.g. for something like session restore. We'll look at this in a v2.
jlebar signed up for (1) and is going to file a bug blocking this one.

In the first post:
> What is the use case for browser-in-a-browser?
>
> If you have a browser... then you have a browser. Why would you want to
> run another one inside your browser?
This is confusing "browser chrome" with "browser engine".
I'm a little surprised that this is so hard to swallow given that this is how the Firefox (and other Mozilla suite) UIs have worked for years and years and years.

Yeah, I think we'll have a much greater chance at getting something standardized once we have at least a demo that we can show.
I would even go as far as saying that this API is complex enough that I don't want to try to standardize anything until we know what the various requirements we have are. The risk of people saying "yeah, that looks like it should be enough" and slapping a standards stamp on it is pretty big.
Not until we have built an actual browser (ideally two, pancake on b2g would be awesome) do I think we'll have a sense for what feature set we need to expose to the browser app, vs. how much can be handled by gecko.

With apologies, I don't know what sec-review-needed means anymore. Do I need to take any action here?
None of this is exposed to the web atm -- it's exposed only to super-privileged Gaia origins. We definitely need to have a long series of conversations before exposing this to the web, but I'm not sure that now is necessarily the right time for those conversations, since the API is still in its early stages of development and we don't have a clear idea what it's going to look like in its final form. But we can have an informal/informational chat about it now, if you'd like!

For a web use case, see https://bugzilla.mozilla.org/show_bug.cgi?id=618354
To obviate the click-jacking concerns, is there any chance of switching to default browser settings which blackout cross-domain mouse coordinates when over an iframe (or at least for this new kind of iframe)?
