Given the the only call to java_lang_String::set_debug_intrinsics is
within an ifdef, shouldn't the declaration and definition of the method
also be guarded the same way?

> This crash is caused by missing array bounds checks on compact string
> intrinsics. It shows up when unsynchronized access to a StringBuilder
> object causes inconsistent field values.
>
> To convince myself that all the necessary bounds checks are being done,
> I put callers into two groups, trusted and untrusted. Untrusted callers
> are all directed through StringUTF16 methods, so that bounds checks are
> done in one place and can be tested easily. Trusted callers bypass the
> bounds checks, so they must do their own checking.

The changes to the JDK core classes are quite extensive. This will need
rigorous functional and performance testing and it is very late in the
release cycle to make these kinds of changes. But I'll leave that to the
core-libs folk to comment on.

David
-----

> As a safety net, I added asserts around the intrinsic calls, and a
> try/catch that so any out of bounds exception turns into an assert error
> as well. Finally, I restored some C2 debug code that was previously
> removed, and I use it to do bounds checking in debug builds. In a
> product build C2 will remove all of these.
>
> See the bug report for tests run.
>
> There are some unavoidable performance regressions on micro benchmarks,
> because now we are doing bounds checks that we weren't before.
>
> dl
>

> The changes to the JDK core classes are quite extensive. This will need
> rigorous functional and performance testing and it is very late in the
> release cycle to make these kinds of changes. But I'll leave that to the
> core-libs folk to comment on.

I have the same concern. Can we fix the immediate problem in 9 and
integrate verification logic in 10?

> src/share/vm/classfile/javaClasses.hpp
>
> Given the the only call to java_lang_String::set_debug_intrinsics is
> within an ifdef, shouldn't the declaration and definition of the
> method also be guarded the same way?

On 3/16/17 2:52 AM, Tobias Hartmann wrote:
>> As a safety net, I added asserts around the intrinsic calls, and a try/catch that so any out of bounds exception turns into an assert error as well.
> So the assert and try/catch are only necessary to catch invalid offsets passed to the C1 intrinsic, right? Interpreted code is safe and the C2 intrinsics have additional guards in debug builds.
>
>
> I'm fine with that but an alternative would be to also have such guards in the C1 intrinsic. For example, one could enable the normal bound checks and add a simple check to Runtime1::throw_range_check_exception() that fails if the throwing method is the C1 intrinsic. Like this, we could avoid the assert, try-catch and DEBUG_INTRINSICS code.
>

On 3/16/17 11:09 AM, Vladimir Ivanov wrote:

>
>> The changes to the JDK core classes are quite extensive. This will need
>> rigorous functional and performance testing and it is very late in the
>> release cycle to make these kinds of changes. But I'll leave that to the
>> core-libs folk to comment on.
>
> I have the same concern. Can we fix the immediate problem in 9 and
> integrate verification logic in 10?
>

OK, Tobias is suggesting having verification logic only inside the
intrinsics. Are you suggesting removing that as well?

I'm OK with removing all the verification, but that won't reduce the
library changes much. I could undo the renaming to Trusted.getChar, but
we would still have the bounds checks moved into StringUTF16.

>> I have the same concern. Can we fix the immediate problem in 9 and
>> integrate verification logic in 10?
>>
>
> OK, Tobias is suggesting having verification logic only inside the
> intrinsics. Are you suggesting removing that as well?

Yes and put them back in 10.

> I'm OK with removing all the verification, but that won't reduce the
> library changes much. I could undo the renaming to Trusted.getChar, but
> we would still have the bounds checks moved into StringUTF16.

I suggest to go with a point fix for 9: just add missing range checks.

Is AbstractStringBuilder.append() the only affected method? (Sorry, it's
hard to say exactly where the problem is by looking at the diff.)

I really like the refactoring you propose on jdk side, but there are
pieces I'm not sure about. For example, I spotted a repeated range check:

>
>>> I have the same concern. Can we fix the immediate problem in 9 and
>>> integrate verification logic in 10?
>>>
>>
>> OK, Tobias is suggesting having verification logic only inside the
>> intrinsics. Are you suggesting removing that as well?
>
> Yes and put them back in 10.

OK.

>
>> I'm OK with removing all the verification, but that won't reduce the
>> library changes much. I could undo the renaming to Trusted.getChar, but
>> we would still have the bounds checks moved into StringUTF16.
>
> I suggest to go with a point fix for 9: just add missing range checks.
>
> Is AbstractStringBuilder.append() the only affected method? (Sorry,
> it's hard to say exactly where the problem is by looking at the diff.)
>

In the failing test, yes, it was append, but when I went to fix the
problem I found that it was much more wide spread, and there were
several methods that were affected.

OK, I did not look for redundant checks. This check is actually not
redundant. The "value" array may be oversized, so "count" is supposed
to contain the current maximum. For the safety of the intrinsic array
access, we check against the array length, but for the API we need to be
stricter and check against the character count.

However, the checkIndex() here is a good example of what is wrong. Let's
say we were checking against value.length instead of "count". Even if
checkIndex() succeeds here, based on the current length of "value", we
can't trust it because the object is mutable and "value" can change
between the checkIndex() and putCharSB().

>
>>> I have the same concern. Can we fix the immediate problem in 9 and
>>> integrate verification logic in 10?
>>>
>>
>> OK, Tobias is suggesting having verification logic only inside the
>> intrinsics. Are you suggesting removing that as well?
>
> Yes and put them back in 10.
>
>> I'm OK with removing all the verification, but that won't reduce the
>> library changes much. I could undo the renaming to Trusted.getChar, but
>> we would still have the bounds checks moved into StringUTF16.
>
> I suggest to go with a point fix for 9: just add missing range checks.

Though as I suggested last time personally I prefer to make minimum
change to simply
seal those holes in ASB at this late stage of JDK9. I'm fine with the
webrev.2 and it looks
better and reasonable to pull all UTF16 operations into StringUTF16.java.

Just for my curiosity, does the change in String#1810 make difference?

> I posted two new versions, webrev.1 keeping the Trusted inner class:
>
> http://cr.openjdk.java.net/~dlong/8158168/webrev.1/>
> and webrev.2 with it removed:
>
> http://cr.openjdk.java.net/~dlong/8158168/webrev.2/>
> dl
>
> On 3/17/17 5:58 AM, Vladimir Ivanov wrote:
>>
>>>> I have the same concern. Can we fix the immediate problem in 9 and
>>>> integrate verification logic in 10?
>>>>
>>>
>>> OK, Tobias is suggesting having verification logic only inside the
>>> intrinsics. Are you suggesting removing that as well?
>>
>> Yes and put them back in 10.
>>
>>> I'm OK with removing all the verification, but that won't reduce the
>>> library changes much. I could undo the renaming to Trusted.getChar,
>>> but
>>> we would still have the bounds checks moved into StringUTF16.
>>
>> I suggest to go with a point fix for 9: just add missing range checks.
>

> Hi Dean,
>
> Thanks for doing this.
>
> Though as I suggested last time personally I prefer to make minimum
> change to simply
> seal those holes in ASB at this late stage of JDK9. I'm fine with the
> webrev.2 and it looks
> better and reasonable to pull all UTF16 operations into StringUTF16.java.
>
> Just for my curiosity, does the change in String#1810 make difference?
>

Yes, it's in the spirit of the comment "return immediately where
possible", and allows me to have

474 assert fromIndex >= 0;

in StringUTF16.lastIndexOf(). This is because fromIndex can go negative
at String#1808.

Thanks for the review.

dl

> Thanks!
> Sherman
>
>
>
> On 3/17/17, 3:07 PM, [hidden email] wrote:
>> I posted two new versions, webrev.1 keeping the Trusted inner class:
>>
>> http://cr.openjdk.java.net/~dlong/8158168/webrev.1/>>
>> and webrev.2 with it removed:
>>
>> http://cr.openjdk.java.net/~dlong/8158168/webrev.2/>>
>> dl
>>
>> On 3/17/17 5:58 AM, Vladimir Ivanov wrote:
>>>
>>>>> I have the same concern. Can we fix the immediate problem in 9 and
>>>>> integrate verification logic in 10?
>>>>>
>>>>
>>>> OK, Tobias is suggesting having verification logic only inside the
>>>> intrinsics. Are you suggesting removing that as well?
>>>
>>> Yes and put them back in 10.
>>>
>>>> I'm OK with removing all the verification, but that won't reduce the
>>>> library changes much. I could undo the renaming to
>>>> Trusted.getChar, but
>>>> we would still have the bounds checks moved into StringUTF16.
>>>
>>> I suggest to go with a point fix for 9: just add missing range checks.
>>
>

The bounds check is needed only in String.nonSyncContentEquals when it
extracts info from AbstractStringBuilder. I don't see how out of bounds
access can happen in String.contentEquals:
if (n != length()) {
return false;
}
...
for (int i = 0; i < n; i++) {
if (StringUTF16.getChar(val, i) != cs.charAt(i)) {

I think bounds checks in StringConcatHelper.prepend() are skipped
intentionally, since java.lang.invoke.StringConcatFactory constructs
method handle chains which already contain bounds checks: array length
is precomputed based on argument values and all accesses are guaranteed
to be in bounds.

============
I moved bounds checks from StringUTF16.lastIndexOf/indexOf to
ABS.indexOf/lastIndexOf. I think it's enough to do range check on
ABS.value & ABS.count. After that, all accesses should be inbounds by
construction (in String.indexOf/lastIndexOf):

> On 3/17/17 5:58 AM, Vladimir Ivanov wrote:
>>
>>>> I have the same concern. Can we fix the immediate problem in 9 and
>>>> integrate verification logic in 10?
>>>>
>>>
>>> OK, Tobias is suggesting having verification logic only inside the
>>> intrinsics. Are you suggesting removing that as well?
>>
>> Yes and put them back in 10.
>>
>>> I'm OK with removing all the verification, but that won't reduce the
>>> library changes much. I could undo the renaming to Trusted.getChar, but
>>> we would still have the bounds checks moved into StringUTF16.
>>
>> I suggest to go with a point fix for 9: just add missing range checks.
>

Thanks. The reason I didn't go with that approach from the beginning is
because I couldn't convince myself that I could find all the missing
bounds checks, and I wanted an interface to test against. With the
bounds checks in AbstractStringBuilder, it is very hard to test all the
possible race conditions, because some of the race conditions only
happen when an ASB field changes half-way through the method.

> Best regards,
> Vladimir Ivanov
>
>> On 3/17/17 5:58 AM, Vladimir Ivanov wrote:
>>>
>>>>> I have the same concern. Can we fix the immediate problem in 9 and
>>>>> integrate verification logic in 10?
>>>>>
>>>>
>>>> OK, Tobias is suggesting having verification logic only inside the
>>>> intrinsics. Are you suggesting removing that as well?
>>>
>>> Yes and put them back in 10.
>>>
>>>> I'm OK with removing all the verification, but that won't reduce the
>>>> library changes much. I could undo the renaming to
>>>> Trusted.getChar, but
>>>> we would still have the bounds checks moved into StringUTF16.
>>>
>>> I suggest to go with a point fix for 9: just add missing range checks.
>>

> On 3/21/17 9:37 AM, Vladimir Ivanov wrote:
>
>>> and webrev.2 with it removed:
>>>
>>> http://cr.openjdk.java.net/~dlong/8158168/webrev.2/>>
>> Thanks, Dean. I started with webrev.2 and tried to minimize the
>> changes. I ended up with the following version:
>>
>> http://cr.openjdk.java.net/~vlivanov/dlong/8158168/webrev.00/>>
>
> Thanks. The reason I didn't go with that approach from the beginning is
> because I couldn't convince myself that I could find all the missing
> bounds checks, and I wanted an interface to test against. With the
> bounds checks in AbstractStringBuilder, it is very hard to test all the
> possible race conditions, because some of the race conditions only
> happen when an ASB field changes half-way through the method.

So are we convinced that the proposed changes will never lead to a crash
due to a missing or incorrect bounds check, due to a racy use of an
unsynchronized ASB instance e.g. StringBuilder?

> I haven't been looking at the details of this but have been watching
> from afar. As per my comments in the bug report (now public) I'm quite
> concerned about the thread-non-safety issue here ...
>
> On 22/03/2017 4:47 AM, [hidden email] wrote:
>> On 3/21/17 9:37 AM, Vladimir Ivanov wrote:
>>
>>>> and webrev.2 with it removed:
>>>>
>>>> http://cr.openjdk.java.net/~dlong/8158168/webrev.2/>>>
>>> Thanks, Dean. I started with webrev.2 and tried to minimize the
>>> changes. I ended up with the following version:
>>>
>>> http://cr.openjdk.java.net/~vlivanov/dlong/8158168/webrev.00/>>>
>>
>> Thanks. The reason I didn't go with that approach from the beginning is
>> because I couldn't convince myself that I could find all the missing
>> bounds checks, and I wanted an interface to test against. With the
>> bounds checks in AbstractStringBuilder, it is very hard to test all the
>> possible race conditions, because some of the race conditions only
>> happen when an ASB field changes half-way through the method.
>
> So are we convinced that the proposed changes will never lead to a
> crash due to a missing or incorrect bounds check, due to a racy use of
> an unsynchronized ASB instance e.g. StringBuilder?
>

If only we had a static analysis tool that could tell us if the code is
safe. Because we don't, in my initial changeset, we always take a
snapshot of the ASB fields by passing those field values to StringUTF16
before doing checks on them. And I wrote a test to make sure that those
StringUTF16 interfaces are catching all the underflows and overflows I
could imagine, and I added verification code to detect when a check was
missed.

However, all the reviewers have requested to minimize the amount of
changes. In Vladimir's version, if there is a missing check somewhere,
then yes it could lead to a crash.

> On 3/21/17 5:02 PM, David Holmes wrote:
>
>> I haven't been looking at the details of this but have been watching
>> from afar. As per my comments in the bug report (now public) I'm quite
>> concerned about the thread-non-safety issue here ...
>>
>> On 22/03/2017 4:47 AM, [hidden email] wrote:
>>> On 3/21/17 9:37 AM, Vladimir Ivanov wrote:
>>>
>>>>> and webrev.2 with it removed:
>>>>>
>>>>> http://cr.openjdk.java.net/~dlong/8158168/webrev.2/>>>>
>>>> Thanks, Dean. I started with webrev.2 and tried to minimize the
>>>> changes. I ended up with the following version:
>>>>
>>>> http://cr.openjdk.java.net/~vlivanov/dlong/8158168/webrev.00/>>>>
>>>
>>> Thanks. The reason I didn't go with that approach from the beginning is
>>> because I couldn't convince myself that I could find all the missing
>>> bounds checks, and I wanted an interface to test against. With the
>>> bounds checks in AbstractStringBuilder, it is very hard to test all the
>>> possible race conditions, because some of the race conditions only
>>> happen when an ASB field changes half-way through the method.
>>
>> So are we convinced that the proposed changes will never lead to a
>> crash due to a missing or incorrect bounds check, due to a racy use of
>> an unsynchronized ASB instance e.g. StringBuilder?
>>
>
> If only we had a static analysis tool that could tell us if the code is
> safe. Because we don't, in my initial changeset, we always take a
> snapshot of the ASB fields by passing those field values to StringUTF16
> before doing checks on them. And I wrote a test to make sure that those
> StringUTF16 interfaces are catching all the underflows and overflows I
> could imagine, and I added verification code to detect when a check was
> missed.
>
> However, all the reviewers have requested to minimize the amount of
> changes. In Vladimir's version, if there is a missing check somewhere,
> then yes it could lead to a crash.

I wonder if the reviewers have fully realized the potential impact here?
This has exposed a flaw in the way intrinsics are used from core classes.

On 22.03.2017 06:09, David Holmes wrote:
> I wonder if the reviewers have fully realized the potential impact here? This has exposed a flaw in the way intrinsics are used from core classes.

For the record, I'm fine with both the initial fix as well as the simplified version. If we don't add the debug runtime checks in the intrinsics, we should definitely add them with JDK 10 as additional safety net (we should consider other intrinsics and C1 as well).

>>> So are we convinced that the proposed changes will never lead to a
>>> crash due to a missing or incorrect bounds check, due to a racy use of
>>> an unsynchronized ASB instance e.g. StringBuilder?
>>
>> If only we had a static analysis tool that could tell us if the code is
>> safe. Because we don't, in my initial changeset, we always take a
>> snapshot of the ASB fields by passing those field values to StringUTF16
>> before doing checks on them. And I wrote a test to make sure that those
>> StringUTF16 interfaces are catching all the underflows and overflows I
>> could imagine, and I added verification code to detect when a check was
>> missed.
>>
>> However, all the reviewers have requested to minimize the amount of
>> changes. In Vladimir's version, if there is a missing check somewhere,
>> then yes it could lead to a crash.

I'd like to point out that asserts and verification code are disabled by
default. They are invaluable during problem diagnosis, but don't help at
all from defence-in-depth perspective.

But I agree that it's easier to reason about and test the initial
version of the fix.

> I wonder if the reviewers have fully realized the potential impact here?
> This has exposed a flaw in the way intrinsics are used from core classes.

FTR here are the checks I omitted in the minimized version (modulo
separation of indexOf/lastIndexOf for trusted/non-trusted callers):

Other than that, the difference is mainly about undoing refactorings and
removing verification logic (asserts + checks in the JVM).

There are still unsafe accesses which are considered safe in both
versions (see StringUTF16.Trusted usages in the initial version [1]).

We used to provide safe wrappers for unsafe intrinsics which makes it
much easier to reason about code correctness. I'd like to see compact
string code refactored that way and IMO the initial version by Dean is a
big step in the right direction.

I still prefer to see a point fix in 9 and major refactoring happening
in 10, but I'll leave the decision on how to proceed with the fix to
core-libs folks. After finishing the exercise minimizing the fix, I'm
much more comfortable with the initial fix [1] (though there are changes
I consider excessive).

>> Also, it looks like the changes I made to ASB.appendChars(char[] s, int
>> off, int end) are not needed.
>
> Agree.
>
>>> Vladimir, don't you need to replace checkIndex with checkOffset in
>>> indexOf and lastIndexOf, so that we allow count == length?
>
> Yes, my bad. Good catch. Updated webrev in place.
>
> FTR I haven't done any extensive testing of the minimized fix.
>
> If we agree to proceed with it, the regression test should be updated
> as well. I think the viable solution would be to construct broken SBs
> (using reflection) and invoke affected methods on them.
>

We can construct broken SBs using the Helper class that gets patched
into java.lang. I'll work on that.