On Dec 15, 2012, at 10:49 AM, David Laight wrote:
> On Sat, Dec 15, 2012 at 09:23:00AM -0800, Matt Thomas wrote:
>>
>> However, there is one disadvantage to using PCU.
>> If you want to use a PCU in the kernel (for faster copies for instance),
>> PCU isn't very accommodating.
>
> I've tried to read the patch, and I'm not entirely sure it is a good
> solution.
> Not least because, I think it requires multiple 'save areas' for
> each LWP.
>
> On amd64 the save area we currently have for SSE2 is 512 bytes.
> Add support for the 256-bit AVX instructions and it increases to 832.
> You really don't want to be allocating multiple such saved areas
> (per lwp) on the off chance the kernel might want to use the registers.
Since this is MD, you only need to save the registers your kernel MD code
will be using.
> What almost works is to make the current lwp the resource owner,
> then require the kernel code to save and restore any registers it
> modifies.
> There are several problems though:
> 1) gdb (etc) will find the wrong registers for the lwp.
Not really, since you can't return to userland before surrendering the PCU.
> 2) Partial state is hard to save (I think i387 is very nasty here).
Depends on the platform.
> 3) Saving and restoring a register may zero the high bits of an
> extended version of that register.
That's an md problem.
> An alternative would be to make the code that wants to use the
> PCU provide the save area (or probably a place where a malloced
> pointer to the save area will be written).
There's no reason why you can't "nest" save areas and pop them
as needed. (Each save area points to the one it's stacked on, etc.)
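
A rough sketch of the nesting idea, just to make it concrete. All the
names here (fake_pcb, save_area, save_push, save_pop, hw_regs) are made
up for illustration and are not part of the PCU interface; a plain array
stands in for the real unit registers, and NREGS is assumed to cover
only the registers the kernel MD code touches:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NREGS 4	/* hypothetical: only the registers the MD code uses */

/* One nested save area; each points at the one it is stacked on. */
struct save_area {
	struct save_area *sa_prev;
	uint64_t sa_regs[NREGS];
};

/* Stand-in for the PCB: it only tracks the innermost save area. */
struct fake_pcb {
	struct save_area *pcb_save_top;	/* NULL when nothing is nested */
};

/* Stand-in for the hardware registers of the unit. */
static uint64_t hw_regs[NREGS];

/* Save the current register contents and stack the area on the PCB. */
static void
save_push(struct fake_pcb *pcb, struct save_area *sa)
{
	memcpy(sa->sa_regs, hw_regs, sizeof(hw_regs));
	sa->sa_prev = pcb->pcb_save_top;
	pcb->pcb_save_top = sa;
}

/* Restore the registers from the innermost area and unstack it. */
static void
save_pop(struct fake_pcb *pcb)
{
	struct save_area *sa = pcb->pcb_save_top;

	assert(sa != NULL);
	memcpy(hw_regs, sa->sa_regs, sizeof(hw_regs));
	pcb->pcb_save_top = sa->sa_prev;
}
```

The caller supplies each save area (on its own stack, say), so nothing
extra has to be preallocated per LWP.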
> You'd then need to save the process's user-space state in the lwp
> and then put a pointer to the temporary save area into the lwp.
> Process switches would then cause the state to be saved in the
> temporary location.
Typically it's in the PCB, or reached via a pointer from the PCB, not in
the lwp itself.
> When finished everything can be unwound and the PCU marked 'available'.
True.
> This all means that you'd have to be doing a significant amount
> of work with the PCU in order to make it all worthwhile.
> As well as the save/restore, things like changing the mode of the PCU
> can take a lot of clocks (I read 170 to switch from SSE2 to AVX).
For now, this is intended for ARM, where most LWPs don't use the PCU, so
the overhead of saving/restoring can be bypassed.
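
To illustrate the bypass, here is a minimal sketch of the check made at
switch time. The names (fake_lwp, l_used_unit, unit_owner, unit_use,
unit_switch_out) are hypothetical and this is not the actual PCU code;
the save itself is replaced by a counter so the effect is visible:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for an LWP; only the fields this sketch needs. */
struct fake_lwp {
	bool l_used_unit;	/* hypothetical: LWP has touched the unit */
	int l_saves;		/* counts full state saves, for illustration */
};

/* LWP whose state is currently live in the unit, or NULL. */
static struct fake_lwp *unit_owner;

/* Called when an LWP first uses the unit (e.g. from a trap). */
static void
unit_use(struct fake_lwp *l)
{
	l->l_used_unit = true;
	unit_owner = l;
}

/* Called when switching away from an LWP. */
static void
unit_switch_out(struct fake_lwp *l)
{
	/* Nothing live for this LWP: skip the save entirely. */
	if (!l->l_used_unit || unit_owner != l)
		return;

	l->l_saves++;		/* the real state save would go here */
	unit_owner = NULL;
}
```

An LWP that never touches the unit takes the early return on every
switch and never pays for a save or restore.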