Description:
------------
This bug was fixed in 7.2.0-beta3, originally found on 7.2.0-beta1, and affecting beta1 and beta2. Adding it here for documenting purposes.
Problem lies within ext/json/json_scanner.c file. Variable utf8_invalid_count is used for counting the delta of the string len due to the invalid utf8 characters ignore or substitution:
int utf8_addition = (s->options & PHP_JSON_INVALID_UTF8_SUBSTITUTE) ? 3 : 0;
s->utf8_invalid = 1;
s->utf8_invalid_count += utf8_addition - 1;
PHP_JSON_CONDITION_GOTO(STR_P1);
As you can see, utf8_invalid_count will decrease by 1 for each invalid byte using _IGNORE option and increment by 2 using _SUBSTITUTE option. This is further used with allocating the string taken from JSON:
size_t len = s->cursor - s->str_start - s->str_esc - 1 + s->utf8_invalid_count;
The problem ( at least it seems ) is that utf8_invalid_count is not reinitialized to zero when entering php_json_scan. Hence, when allocating another string, the buffer will be too small due to utf8_invalid_count decrementing the len.
The size of the overflow is controllable by attack and it equals O = (L2 – L1) where L2 = length of second string allocated and L1 = amount of invalid UTF8 bytes injected into first string.
Following POC should reproduce the attack ( Im writing this off my head as I have the real POC on my other laptop, in case it’s not proper I’m going to send the proper POC tomorrow ):
<?php
$text = '"'.chr(0xC1).chr(0xC1).'""DDD"';
var_dump(json_decode($text,false,512,JSON_INVALID_UTF8_IGNORE));
What will happen is that the first string ( "#C1#C1" ) will be properly stripped of two invalid bytes and returned, however the second string will get allocated too small due to utf8_invalid_count = -2 , hence len instead of being 3 will become 1.
Then, inside the php_json_scanner_copy_string routine, buffer overflow will happen.
_SUBSTITUTE option , while less convenient, can also be used for exploiting as to overflow the sign bit ( utf8_invalid_count is a 32 bit signed int ) we need to send around 1.1 GB of invalid bytes, and around 2.2 GB to make the counter be feasible for attack.
Fortunately, the feature was recently introduced and is not adopted wide, but we should definitely act quickly to make sure the bug doesn’t make it to 7.2.0 as there’s big potential for attack surface once adoption takes place.
Test script:
---------------
<?php
$text = '"'.chr(0xC1).chr(0xC1).'""DDD"';
var_dump(json_decode($text,false,512,JSON_INVALID_UTF8_IGNORE));
Expected result:
----------------
Deserialized json object
Actual result:
--------------
Crash

History

Thanks for reporting this! According to our security issue
classification this issue is a low severity issue, and since it
already has been fixed, there is no need for a private report.
Anyhow, the bug fix has to be documented in NEWS and the
changelog, and a respective regression test should be added.

> Im writing this off my head as I have the real POC on my other
> laptop, in case it’s not proper I’m going to send the proper POC
> tomorrow
Yes, please. While I can confirm issues regarding an
uninititalized variable in beta1 and beta2, I cannot reproduce the
segfault, and actually $text looks strange, and might have been:
'["'.chr(0xC1).chr(0xC1).'", "DDD"]'

> The test also cover the segfault (it won't work for your example as you have 3
> bytes after - you need to have less to make it segfault - see the test)
Ah, that explains the issue. So since we're having a proper test already, no
need to add another one.
NEWS updated via <http://git.php.net/?p=php-src.git;a=commit;h=613bac9>.
Actually, there's no need to update the Changelog separately, because there is
none yet for PHP 7.2.