[meta] A secure theme system (with twig)

Makes Drupal less vulnerable to abuse or misuse. Note, this is the preferred tag, though the Security tag has a large body of issues tagged to it.Do NOT publicly disclose security vulnerabilities; contact the security team instead.

Anyone (whether security team or not) can apply this tag to security improvements that do not directly present a vulnerability e.g. hardening an API to add filtering to reduce a common mistake in contributed modules.

A note in advance: Even though this includes some specifics of Twig and the idea is to implement it with Twig, most of the ideas are written in abstract form with an hypothetical "secure" phptemplate engine in mind, so that the whole community can participate in the discussion.

Motivation

The past has shown that custom themes often were prone to XSS (Cross Site Scripting), where user input has not been filtered properly.

Often this can be as easy as the following line in node.tpl.php:

echo $node->title;

which seems totally harmless (at least for those not reading the docs properly).

But also more "interesting" constructs have been found in the wild like full MySQL queries, function calls, etc. inside templates that exposed raw data and such an attacker can evaluate custom JS. An easy way to test this is setting for example title to <script>alert('xss');</script>.

Approach

One solution to this never ending spree of security problems is sandboxing and auto escaping. In sandboxing the allowed PHP commands are restricted, in auto escaping everything that is output is automatically escaped (think of each string being run through check_plain() before it is output).

Twig is a system, which allows compiling templates from a simple markup language to PHP. Therefore twig templates can be restricted in what functions are callable. Twig also allows auto escaping of output.

Problems

So restrict function set and enable auto escaping and all is well? Unfortunately not quite in the Drupal World.
(There is another problem that auto escaping needs to know the context for the proper escaping strategy, but this is for now not relevant to this issue.)

and now lets hypothetically think of that all echo's in the node.tpl.php are auto-escaped with: echo check_plain(x):
(this is essentially what twig is doing, but we simplify it here)

<h2 class="title"><?php echo check_plain($label);

?>

We end up with a double escaped title, which leads to for example the output of "Drupal & Friends" to appear like&quot;Drupal &amp; Friends&quot;.

Solutions

To solve this problem we have several possible solutions, but all have different drawbacks and advantages and here I need community input:

In Twig it is possible to mark "output" as safe by wrapping it as part of a Twig_Markup object. Lets assume we had such a concept in Drupal, which is called SafeString and that we have a function to mark a string safe and check if a string is safe:

drupal_mark_safe() / drupal_is_safe()

Lets further assume for a moment that we have a function called drupal_autoescape(), which is aware of that and does not double escape already safe strings, but rather returns them as is:

Third - The engine guesses safe strings: Mark all strings found in $variables safe for an auto escape aware engine
(this can also happen on run-time depending on the engine)

First - escape functions return safe strings (SafeString)

The first is the most secure, but also the most complex variant ( implemented in #1712444: Twig: Activate autoescape mode as proof of concept). All strings that are escaped in Drupal are then objects of instance SafeString. Such calls to is_string are failing, while is_object calls can be misleading.

The advantage of the first variant is that besides fixing tests and code, the effort is rather simple as only all escaping functions need to return "SafeString"s. Also it is secure, because only a string that has been escaped, is not escaped again.

Second - Modules need to mark escaped strings as safe manually

The second is less secure than the first as it is more prone to errors like "I totally escaped that string in the other function. That is safe here.", but it still protects template authors of unintentionally leaking data like: echo $node->title; (as it is wrapped in drupal_autoescape). And even if core or contrib accidentally exposes an insecure variable like $title to the template ([#bartik-prone-to-xss]) as long as they did not deliberately marked it safe, it is still safe and auto escaped properly.

The disadvantages are (besides being not totally secure) that all output needs to be marked safe manually and that means changing all preprocess / process functions. While in the first case escaped, but not marked safe strings could not happen, here the template is broken and returns double escaped output till it gets fixed and marks the strings safe correctly. However as all templates are changed anyway as part of the Twig effort this might be a good opportunity to make them safe at the same time.

The advantage is that there are no new side effects of calling escaping functions like check_plain() as they return strings like normal and only templates need to deal with the (printable) SafeString objects. And it prevents a series of accidentally forgotten escaped variables.

Third - The engine guesses safe strings

In the third variant the auto escape able engine marks all strings in variables with drupal_mark_safe before passing them to the template.
(This makes sense as template authors are encouraged to use these variables as safe strings already now in templates.)

Therefore, in this third variant the theme author is only a little more protected than he is now: First, he cannot call insecure functions and second, code like echo $node->title; would still be detected as being insecure. However code like echo $title; would be not if title was insecure (like in the example in the second variant), because it was marked as safe.

Advantages are ease of implementation and less code changes. Disadvantages are the reduced security and less protected variables, which might give a false sense of security.

Refinements

There are some further issues that need to be taken into account:

TypedData Interface and SafeStrings

Using Twig_Markup objects for safe strings was objected, it was proposed to use the TypedData Interface instead and provide a SafeString extends String. However SafeString would need to implement __toString() method to work efficiently and I am not sure if that is "allowed" as part of TypedData Interface. A possibility could be to use a class SafeStringMarkup, which further extends SafeString and then provides the __toString method. Thoughts are welcome here.

Render Arrays / Output from render() / drupal_render()

node.tpl.php has for example

print render($content);

which auto escape would transform into

print drupal_autoescape(render($content));

Such we would need to define that output from render arrays is secure, which means that

a) render() / drupal_render() is only concat-ing safe strings in the first place
b) everything that is returned from render() is marked with drupal_mark_safe()

Given that render in essence calls theme() and template wrappers and adds #suffix / #prefix (and of course lots of nice logic), this seems reasonable, but it of course introduces possibilities for module authors and themers that change template.php to introduce XSS accidentally.

Overall however this just means that the security for these parts is not better than what we have now, but it is still note worthy.

API - Public API for these functions

I now proposed drupal_mark_safe() / drupal_is_safe() as easy to understand interface. However for performance reasons a check for $str instanceof SafeString would probably be used instead.

Drawbacks

The biggest drawback of any auto-escape solution is performance, but benchmarks in twig_engine and twig-in-core patches show that an additional function call (auto escape) doing "instanceof" and wrapping within an object for marking strings safe is really fast and that the overhead of marking something as secure is minimal compared to the overhead of escaping itself (which we _need_ to do).

Summary

I personally would love to see a secure theme system in Drupal 8. There are certain challenges to overcome. I think we can overcome them and have a much more secure theming layer by default.

I would love if all escaped strings in Drupal 8 were typed and Safe (Solution 1), but I see that this is really complex.

My second preference is Solution 2, where there is more work for the module author and/or themer, but the effort is directly paid out in security.

I will minimally integrate Solution 3 into Twig as that has no side-effects and if a string needs to be output it can still be marked safe or raw as part of the template (that is always possible like {{ obj.myproperty|raw }} in Twig or echo drupal_mark_safe($obj->myproperty); in PHP.

I think it might be a good idea to start with 3, then move to either 2 or 1 as follow-up.

Please keep this focused to the topic on hand, because this is not necessarily about Twig - it applies to any template language doing auto escape / sandboxing. And please do not discuss the advanced topic of that auto escape for HTML is wrong in a JS or CSS context, etc. This is all known and really complex but it is not within the scope of this issue to solve (see http://groups.drupal.org/node/255293). Thank you!

A few important things to consider: wholesale autoescaping all user input is not a good idea for sure because it can be rich text which is made safe elsewhere. No, the role of autoescaping and everything Twig in security is twofold: remove PHP from templates and put the onus of security on people who write PHP.

That said. While 1. is a very alluring option I started implementing it's a nightmare in reality because of strings not being strings. The complexity and side effects are really frightening.

Nowadays we already presume that variables provided for a template are made safe previously, somewhere. Some are check_plain'd in preprocess, some come from t(), some come from l(), some come from drupal_render (which is presumed to return safe things). If all of them are safe anyways, then why bother manually? That's not nice a DX.

So, my vote is on 3. It's the simplest, has very few if any side effects because it operates within the boundaries of the template system and still puts the kibosh on making XSS holes via {{ node.title }}.

#3 seems a little iffy because as soon as you have a scenario where a themer wants to print something like $node->field_whatever[0]['safe'] (which contains safe HTML), the auto-escaping will come along and break that, so they'll have to explicitly mark it as safe instead? And I think the security benefit of this becomes questionable as soon as you start having things like {{ variable|raw }} all over the templates... because then people start learning that it's a good idea to use that everywhere whenever they run into a problem.

#1 reminds me of Taint PHP (http://jaspan.com/tainted-love). I've always wondered if there was a way to do that without modifying PHP itself. But treating strings as objects definitely seems pretty iffy... I can actually imagine some ways to do it without that (keeping a registry of already-escaped strings or their hashes somewhere in memory and checking against that before auto-escaping?); the performance or memory implications of that might not be good, but I wonder if anyone has tested it?

So, what if we do $GLOBALS['safestrings'][$string] = TRUE to mark safe strings? I know we hate globals but this is the perfect case for it -- if a string is safe, it is. That won't change, ever. As for testability, it's still testable. The performance implication are very likely to be super small, there's this assignement and there's one isset in Twig when wanting to escape. Looks doable to me. We need to memory benchmark this. My gut feeling is that even with the inefficiency of PHP storing things this will not be very horrific. I quickly tested it and 10 000 random strings a total of 32 912 113 characters long stored as shown above took 33 822 512 bytes. Of course it can't be known without actually testing this but if we presume that the sanitized strings total size are comparable to the HTML source size (that stands to reason if we do not produce tons of sanitized strings which arent printed) then, well, I looked at a 300 comment issue and that's 438 574 bytes. All in all, it stands to reason we are looking at worst a .5MB memory and negligible performance loss.

@spekkionu: Interesting point. However, that wouldn't help with all cases because in addition to check_plain() followed by an (auto-escaped) check_plain(), you can also having something like filter_xss_admin() (which produces a safe string but one that is expected to contain HTML), and in that case we still need to prevent the auto-escaping code from calling check_plain() on it even once.

@chx makes a good point that a typical HTML page really only contains so many strings. Maybe this could work. It wouldn't have to be a global necessarily (it could be stored somewhere else... inside the request container, maybe?)

I did just think of a couple complications, though:

We're basically assuming that code which sanitizes strings will only call Drupal functions to do it, not PHP functions directly (or functions provided by some outside library). Is that OK? I guess this issue would only ever cause double-escaping, though (never cause a problem in the other direction), so perhaps it's OK to just say that's enough of an edge case where sites that do that will need to do {{ variable|raw }} when necessary, because we can't help them.

We're assuming we know how to identify that a string has been made safe. That's true for check_plain() and some other functions, but gets a little trickier with check_markup(), and even with filter_xss(). We're kind of trusting that people using those functions are using them correctly, I guess.

Yes, we trust people that stuff coming out of check_markup is safe. Our handle text in a secure fashion documentation page says explicitly "All you need to do is pass the rich text through check_markup() and you'll get HTML returned, safe for outputting." Similarly, if you add <embed> to the list of tags in filter_xss, hats off, I neither can or want to help you.

As for caches, no worries, consider that drupal_render or check_markup cache their own output which if known to be safe can be marked safe on cache retrieval as well. (Note that we already presume drupal_render returning safe strings by printing render($element) always without further escaping.) Ie. caching when handled inside a function does not need more special handling than adding it to the safe strings based on the same conditions as the normal return would be.

Benchmarks - Setup

Measuring Method

* 42 random nodes
* Testing: /node

with:

for i in {1..10}; do ab -n100 http://127.0.0.1/node/; done

All benchmarks done with the GLOBALS method outlined in #2 (because the SafeString objects method is way slower - even though the overhead is not directly visible in XHProf). Attributes and twig_render() calls still extend / return Twig_Markup if safe in these benchmarks.

Extra

Conclusion

* Twig in itself has overhead. Some of that comes from the Twig Engine directly; some is related to the DIC and double engine code (overall: 2.5-3% slower)
* The overhead of the template itself compared to node.tpl.php is marginal (0.74-1.1% slower)
* Autoescape costs additionally, the slowdown compared to Twig ranges from 1.45% - 4.36% depending on the used method
* Therefore in whole the overhead ranges from: 2.5-3% (without autoescape) to 4.5-8% (with autoescape)

Summary

TL; DR:

V1:

WT : +22,518 +6.2%
CPU : +24,002 +6.7%
MU : +1,732,448 +14.9%

V2:

WT : +16,569 +4.6%
CPU : +20,001 +5.6%
MU : +818,832 +7.1%

V3:

WT : +21,054 +5.8%
CPU : +20,002 +5.6%
MU : +861,808 +7.4%

For 42 nodes the time used by the approaches really does not differ much ... The memory is differing, but this could very well change to the maximum of 14.9% overhead, if all templates are converted to twig.

Curious about something, since I would really love to commit these Twig conversions that are kicking around, but haven't because of Dries's concern that it could leave core in an unreleasable state by July 1 if they're not all done *and* architecture issues like this aren't figured out (which are dependent on all the conversions getting done).

The main reason this issue is postponed on every single template file being converted is because it effectively removes the security layer from preprocess and makes it the theme engine responsible. How hard would it be to create a patch that did this paradigm shift for PHPTemplate, so they both behaved the same? So that we could hash out these really gnarly architectural issues *now* and not have to wait N months?

Talking this over with Moshe earlier today, he really views this as a "make it or break it" issue for Twig. Auto-security was one of the primary stated benefits we get from replacing our theme system, to help counter-balance other significant disadvantages such as the deployment challenges around adding much more compiled PHP. Escalating to Critical in response.

So essentially, whatever we can do to expedite proving this capability of Twig sooner, the better off we will be. It would be bad if it turns out we can't pull this off after all.

As far as I understand, it's quite possible to do this, but the question is how much of a performance and memory hit we're willing to take for it? (Probably this issue needs some more up-to-date benchmarks anyway.)

Talking this over with Moshe earlier today, he really views this as a "make it or break it" issue for Twig. Auto-security was one of the primary stated benefits we get from replacing our theme system, to help counter-balance other significant disadvantages such as the deployment challenges around adding much more compiled PHP.

I get where this is coming from (and to some extent even agree with it), but "primary stated benefits"... as far as I remember, this was discussed at length in some of the original issues and I thought it became pretty clear in that discussion that Twig itself provided no real security benefit for Drupal. Rather it was more like (exactly as stated in the issue summary here) if we can independently build a more secure theme system on the side, Twig will work very very cleanly with it.

In the original discussions, I think a lot of people felt like Twig had enough other benefits that even if there was no security improvement, it was worth doing for other reasons?

1) It's really really hard to throw random SQL into the template if it's a Twig file. By the time you coax it into working, you should have realized that it's hard because you're not supposed to do it. :-)
2) Twig natively handles escaping of variables automatically, and lets you specify in the template what kind of escaping to do. That's not a guarantee of correct filtering and secure output, but it's more reliable than "hope that someone got it right back in one of the 12 preprocess functions that fondled the variable before it got here".

Currently we're not using the second, AFAIK, because we're still doing escaping back in those 12 preprocess functions and don't want to double-escape everything. We want to switch over to Twig doing the escaping, but that's tricky to time.

I'm a bit confused here. With Twig now in core, D8's theme layer is no less secure than D7's theme layer sanitization-wise, and definitely more secure syntactically, i.e. Crell's first point in #23 (in addition to any other PHP).

So we are arguably better off already than D7, and D7 shipped. Which makes me want mark this closed (won't fix).

I brought up this point on the Twig call yesterday, but the whole "autoescape" notion seems like a unicorn to me. :) We are potentially introducing new APIs a month before code freeze in the name of a security "problem" that exists just as prominently in D7. So, if this is a "release blocker" for D8, does that arguably demand a critical SA for D7? You see my confusion. :)

Additionally, why not just provide support for the multiple types of sanitization functions. Why not instead of playing this auto-escape guessing game, we provide support Twig, ( example {{ foo|check_markup }} or {{ bar|check_plain }})?

To respond to #20 and #21, there are eight points listed in the "Pros" of #1499460: [meta] New theme system, one of which being "Improved security," some of which we get automatically in Crell's #23. Twig is good for many other reasons, and we are already better off security-wise than D7.

I think "closed (won't fix)" is a little extreme, but then again so is making this issue critical. If we can do autoescape (and it sounds like we can) then we should.

I'm oaky with allowing things like check_plain and check_markup as special filters (or functions?) for use in Twig templates, as long as theme devs don't need to use them.

And by "need" I mean that all of the drillable variables that core provides should be sanitized automatically, so when some poor front end dev tries to print out {{ content.field_link.0.text}} they don't need to understand the inner workings of Drupal, or guess blindly at which point they need to start sanitizing themselves.

If more advanced theme devs want to add their own stuff in preprocess to print out whatever, then I am okay with them needing to sanitize that themselves.

all of the drillable variables that core provides should be sanitized automatically

This may be better suited for #2008450: Provide for a drillable variable structure in Twig templates, but I agree that drillables should be safe. However, that simply connotes the drillable variables (which likely come from a preprocessor) just need to be sanitized, which (as of now) would be the job of the preprocessors' function body anyway. So I agree with your quoted statement above sans what the word "automatically" connotes; the theme preprocessor can been written to decide what vars it makes available and what it doesn't. If you write your own preprocessors or callbacks can be responsible for sanitizing your data, just as you were in prior versions of Drupal.