2013/3/14 Tab Atkins Jr. <jackalmage@gmail.com>:
> On Thu, Mar 14, 2013 at 1:46 PM, Mike Samuel <mikesamuel@gmail.com> wrote:
>> 2013/3/12 Tab Atkins Jr. <jackalmage@gmail.com>:
>>> Ian provided several examples of code where it seems like it would be
>>> impossible to auto-escape properly, and an author relying on
>>> auto-escaping because they've been trained that it works elsewhere
>>> could be easily misled and inadvertently cause an XSS vulnerability.
>>> Could you go over those and answer how you think your ideas for
>>> auto-escaping would address the problems he raised?
>>
>> 2013/3/12 Ian Hickson <ian@hixie.ch>:
>>> What would be autoescaped in something like:
>>>
>>> h`<img src="${scheme}://${host}:${port}/${path}/${file}.${ext}"
>>> srcset="${file1} ${w1}w ${file2} ${w2}w"
>>> alt="${alt}"
>>> data-logger-url="logger?id=${id}&key=1234">
>>>
>>> ...? (where h`` is your autoescaper; obviously pretend that part is
>>> done however your syntax would really work, and strip newlines if
>>> necessary, obviously.)
>>
>> The parts in the src are all URI encoded. Any parts that appear after
>> a literal '?' or '#' are encoded so as to prevent parameter splitting.
>
> That implies that it's impossible to put in a url with ? or # in it, right?
Nope.
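To make that concrete, here is a minimal sketch of one way a URL-aware
tag could behave. The tag name `url` and the exact policy are my
assumptions for illustration, not what closure-templates or Go ship.
Values that appear before any literal '?' or '#' keep their URL
structure, so a value that itself contains '?' or '#' passes through;
values after a literal '?' or '#' are fully encoded so they cannot
split parameters. Scheme filtering (e.g. javascript:) is left out.

```javascript
// Hypothetical tag function: the name `url` and this policy are assumptions.
function url(strings, ...values) {
  let inQuery = false; // true once the literal text has contained '?' or '#'
  let out = '';
  strings.forEach((literal, i) => {
    out += literal;
    if (/[?#]/.test(literal)) inQuery = true;
    if (i < values.length) {
      const v = String(values[i]);
      // Before the query: encodeURI leaves ?, #, /, & and = intact.
      // After a literal ? or #: encodeURIComponent escapes them all,
      // so a value cannot introduce extra parameters.
      out += inQuery ? encodeURIComponent(v) : encodeURI(v);
    }
  });
  return out;
}

// A whole URL containing ? and # survives when interpolated up front.
const whole = url`${'profile.cgi?a=1#top'}`;

// A value after the literal ? cannot smuggle in a second key.
const logger = url`logger?id=${'7&key=evil'}&key=1234`;
```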
> It doesn't help the srcset at all, even though the browser knows that
> it accepts urls.
I wasn't aware that srcset was in the schema. I'll have to update to
take that into account.
> Are you claiming that literal ? or # in the data-logger-url case cause
> parameter encoding? Or were you referring solely to the src part, and
> the rest are completely unescaped?
Yes, given a heuristic that recognizes data-logger-url as URL content.
>> In the closure-templates and Go versions, we have heuristics to let us
>> determine if custom attributes or data-* attributes are URL content.
>> This was based on an inspection of template code prior to the
>> introduction of contextual auto-escaping, and since Closure templates
>> are compiled statically it allows our pen-testers to keep a list of
>> known attributes that pass the heuristic and flush out new
>> non-standard attributes that don't.
>
> I doubt we want to put in heuristics for a standard escaper that looks
> for attribute values where the literal part "looks like" a url. That
> sounds extremely scary, since a relatively small change in what parts
> of the url are contained in the literal segment could potentially make
> it stop recognizing.
Again, I'm not proposing standardizing anything, so I don't know who
"we" are. Library authors can provide naming-convention heuristics as
a per-project option, and projects with a high security profile can
use pre-submit checks that flag custom elements or attribute names
that are outside their naming conventions.
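As a rough sketch of what such a per-project option and pre-submit
check might look like: the suffix convention, the attribute lists, and
the function names below are all hypothetical, chosen only to
illustrate the idea.

```javascript
// Hypothetical project convention: an attribute carries URL content if it is
// a known URL attribute or its name ends in -url, -uri, -href, or -src.
const KNOWN_URL_ATTRS = new Set(['src', 'href', 'action']);

function isUrlAttr(name) {
  const n = name.toLowerCase();
  return KNOWN_URL_ATTRS.has(n) || /(^|[-_])(url|uri|href|src)$/.test(n);
}

// Pre-submit check: return data-* attribute names that neither match the
// URL convention nor appear on the project's reviewed list.
function flagUnknownAttrs(names, reviewed) {
  return names.filter((raw) => {
    const name = raw.toLowerCase();
    return name.startsWith('data-') && !isUrlAttr(name) && !reviewed.has(name);
  });
}

const flagged = flagUnknownAttrs(
  ['src', 'alt', 'data-logger-url', 'data-payload'],
  new Set() // nothing reviewed yet
);
```

Here data-logger-url passes the convention and is escaped as a URL,
while data-payload is reported for a human to classify.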
>>> How about this:
>>>
>>> x`<img width="${width}"
>>> src="${profile.cgi?username=${username}&size=${width}}">
>>> <script>
>>> var x = new Image(${width});
>>> x.src = 'profile.cgi?username=${username}&size=${width}';
>>> </script>`;
>>
>> Quite. We really need an intercession layer for the DOM that lets us
>> intercept assignments to sensitive properties and do late-binding of
>> escaper to templates. Yay proxies.
>
> I don't think you understand this example properly. The template
> creates the img *and* the script. There's nothing there to late-bind.
Ah. I thought the quotes around the x.src value were `...`. In that
case, the proxy could reject by default.
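The default-reject idea could look roughly like the sketch below. A
plain object stands in for the Image, and the `VettedUrl` wrapper,
the `guard` function, and the list of sensitive properties are all
assumptions made up for this example.

```javascript
// Hypothetical marker type for values that an escaper has already vetted.
class VettedUrl {
  constructor(value) { this.value = value; }
}

// Assumed list of properties that should never take a raw string.
const SENSITIVE = new Set(['src', 'href']);

function guard(target) {
  return new Proxy(target, {
    set(obj, prop, value) {
      // Default reject: a sensitive property only accepts a VettedUrl.
      if (SENSITIVE.has(prop) && !(value instanceof VettedUrl)) {
        throw new TypeError(`rejected unvetted assignment to ${String(prop)}`);
      }
      obj[prop] = value instanceof VettedUrl ? value.value : value;
      return true;
    }
  });
}

const img = guard({});
img.width = 10;                            // not sensitive, allowed
img.src = new VettedUrl('profile.cgi');    // vetted, allowed
// img.src = 'profile.cgi?username=' + u;  // would throw TypeError
```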
>>> How about:
>>>
>>> x`<p>Paste this WLAML command: AB=2%\*2*11*22;GA=${GADATA}*41</p>`
>>
>> Social engineering will affect all technical solutions as shown in
>> this E4H template
>>
>> <>{x}</>
>>
>> with
>>
>> x = "Paste this into your URL bar : javascript:pwnMe()"
>
> I believe the point here was not social engineering, but to point out
> something that is thematically similar to a URL, and that thus might
> be expected by engineers to be as "safe" as a url is (not needing
> manual escaping), when that is actually insecure. The "paste" part is
> irrelevant - just filler text in the example to introduce why there
> might be such a command put into page text.
I don't follow. How does this lead to unintended side-effects or
data-leakage without user intervention?