On Apr 20, 2010, at 6:43 PM, John Rose wrote:
> On Apr 20, 2010, at 4:38 PM, Tom Rodriguez wrote:
>>>> One partial cleanup would lead to some simplifications. It would change the convention by which the interpreter pops its outgoing arguments. Currently it pretends to keep the arguments on stack throughout the call; it should release them to the callee before completing the control transfer to the callee.
>>>> I'm not sure how this can be changed. The outgoing expression stack forms the base of the locals in the callee frame. The SP of the frame also has to cover all live values in the frame throughout its lifetime, otherwise a signal handler frame can be pushed on top of the current SP and overwrite live values.
>> For x86, on an outgoing call rsi and bp->last_sp would contain the popped sp value, but sp would continue to be the argument base. During the call [sp..rsi) would be the space holding the arguments being handed off from the caller to callee. Immediately out of the interpreter, the outgoing receiver argument would be reachable at [rsi-wordSize]; at present it's always awkward to calculate the receiver end of the argument list. For method handle adapters and maybe i2c adapters we'll want to use another register to remember the receiver end of the argument list, even after argument list reformatting.
Ignoring the details of how, what you really want is for the unextended sp to be the value that sp would have after the caller popped the outgoing arguments? So the call bytecode itself would compute how many arguments are popped, instead of that being done in the return entry point. Something like that could probably be worked out, though I'm unconvinced that it does more than shuffle around some complex code. I assume the benefit would come from changes to rsi/Gargs.
tom
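
[Editorial sketch, not part of the original message.] The address arithmetic under discussion can be illustrated with a small model. This is only a sketch under stated assumptions: a downward-growing stack, an 8-byte word, and an argument count that includes the receiver. The function names and WORD_SIZE are hypothetical and are not HotSpot identifiers; the values follow the thread's description of x86 ([sp..rsi) holds the outgoing arguments, receiver at [rsi-wordSize]).

```python
WORD_SIZE = 8  # assumed bytes per stack slot (64-bit x86); hypothetical constant


def current_convention(sp_at_call, arg_words):
    """Model of today's behaviour as described in the thread.

    The saved SP points at the argument list base (lowest address), so the
    return path must re-adjust SP by the size of the argument list.
    """
    saved_sp = sp_at_call
    sp_after_return = saved_sp + arg_words * WORD_SIZE  # extra work on return
    return saved_sp, sp_after_return


def prepop_convention(sp_at_call, arg_words):
    """Model of the proposed pre-pop convention.

    The call site computes the popped SP up front (the value that would go
    into rsi and last_sp), while sp itself stays at the argument base.
    """
    popped_sp = sp_at_call + arg_words * WORD_SIZE
    receiver_addr = popped_sp - WORD_SIZE   # [rsi - wordSize]
    arg_range = (sp_at_call, popped_sp)     # half-open range [sp..rsi)
    return popped_sp, receiver_addr, arg_range


if __name__ == "__main__":
    sp = 0x1000
    print(current_convention(sp, 3))
    print(prepop_convention(sp, 3))
```

In this model both conventions end up at the same post-return SP; the difference is only where the addition happens (return entry versus call site) and whether the receiver address is available without a further calculation.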
>>> It's also possible to stop at the call site and GC before entering the callee and the arguments have to be live somewhere when this occurs.
>> A "resolve blob" like "handle_wrong_method" does that, because it sets the special 'caller_must_gc_arguments' bit. That's always handled as a special case, even today, so the special-case code would need to be adjusted to address the arguments using the different convention. (See 'map->include_argument_oops' in oops_interpreted_do, etc.)
>>> Also on sparc Lesp and SP are different registers but on intel they are the same which can complicate any changes in this area. Am I misunderstanding what you are suggesting needs to be changed?
>> For sparc, not much would change except the treatment of Lesp. SP(O6) and Gargs(G4) jointly serve to delimit the storage of outgoing arguments, and this wouldn't change (just as on x86 SP wouldn't be popped). The Lesp(L0) register would be cut back to mark the limit of the caller's stack, exclusive of arguments. The outgoing receiver would be reachable at [L0-wordSize].
>>>> Currently, as a result, a return to the interpreter resets to a saved SP that points to the argument list base (lowest address, not highest), the address of an argument list that (at that point) is long gone. The interpreter must then perform extra calculations to re-adjust the SP by the size of the absent argument list, calculations which add no value. And when stack walking, we have to continually remember that the unextended SP might overlap with stack storage owned by the callee, which is counterintuitive and therefore bug-prone.
>>>> That's new with method handles. Previously frames could only be enlarged as a result of SP adjustments by the callee but the new argument shuffling logic means they can be shrunk too. The shrinking could be avoided if we moved all the arguments when shrinking was required. Again I don't see how this can be avoided unless you disallow frame size adjustment which is pretty much impossible I think.
>> For an adapter method handle, there are four interesting addresses: The two ends of the scratch memory we can use, and the two ends of the argument list. This boils down to three values on Intel, since two of those 'ends' are pointed at by rsp. On sparc it's a little trickier. The current rsi value passed out of the interpreter doesn't help locate any of those values; if we pre-pop arguments, then rsi becomes the far end of the scratch memory, as well as the initial far end of the argument list.
>> I haven't worked out the details on sparc, but I imagine they are similar, except for dealing with the duplicate stack pointers.
>> -- John