Comments

On 03/04/2013 03:19 PM, Laszlo Ersek wrote:
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
> +# @guest-set-vcpus:
> +#
> +# Attempt to reconfigure (currently: enable/disable) logical processors inside
> +# the guest.
> +#
> +# The input list is processed node by node in order. In each node @logical-id
> +# is used to look up the guest VCPU, for which @online specifies the requested
> +# state. The set of distinct @logical-id's is only required to be a subset of
> +# guest-supported identifiers. There's no restriction on list length or on
> +# repeating the same @logical-id (with possibly different @online field).
> +# Preferably the input list should describe a modified subset of
> +# @guest-get-vcpus' return value.
> +#
> +# If part or whole of the requested operation can't be carried out, the guest
> +# VCPU state will be unspecified.
Completely unspecified? Or is it guaranteed that a subsequent
successful guest-get-vcpus can still be used to reliably learn, after
the fact, what happened? Would it make any more sense to have only a
guest-set-vcpu, which attempts to set the state of a single vcpu,
instead of an open-ended array of successive vcpu modifications in
guest-set-vcpus?
The interface seems relatively sane, though, and it looks like something
that libvirt would be able to use without having to add any new APIs
(just a new flag value to the existing virDomainSetVcpusFlags() function).
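For readers following along, the in-order processing described in the quoted doc comment can be sketched roughly as below. This is only an illustration of the documented semantics, not code from the patch: the function name `apply_vcpu_list` and the plain dict standing in for guest VCPU state are assumptions made for the example.

```python
def apply_vcpu_list(online_by_id, requests):
    """Apply requests in order to a hypothetical model of guest VCPU state.

    online_by_id: dict mapping each guest-supported logical-id to a bool.
    requests:     list of {"logical-id": int, "online": bool} nodes.
    """
    for node in requests:
        logical_id = node["logical-id"]
        # Identifiers must be a subset of what the guest supports.
        if logical_id not in online_by_id:
            raise KeyError(f"unsupported logical-id: {logical_id}")
        # Repeats are allowed; a later node simply overrides an earlier one.
        online_by_id[logical_id] = node["online"]
    return online_by_id
```

Note that in this model a failure partway through leaves the earlier nodes applied, which is the situation the "unspecified" wording is about.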

On 03/05/13 22:08, Eric Blake wrote:
> On 03/04/2013 03:19 PM, Laszlo Ersek wrote:
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>
>> +# @guest-set-vcpus:
>> +#
>> +# Attempt to reconfigure (currently: enable/disable) logical processors inside
>> +# the guest.
>> +#
>> +# The input list is processed node by node in order. In each node @logical-id
>> +# is used to look up the guest VCPU, for which @online specifies the requested
>> +# state. The set of distinct @logical-id's is only required to be a subset of
>> +# guest-supported identifiers. There's no restriction on list length or on
>> +# repeating the same @logical-id (with possibly different @online field).
>> +# Preferably the input list should describe a modified subset of
>> +# @guest-get-vcpus' return value.
>> +#
>> +# If part or whole of the requested operation can't be carried out, the guest
>> +# VCPU state will be unspecified.
>
> Completely unspecified?
Yes. "Unspecified" means "valid" (i.e. at least one VCPU will be online,
the guest won't be "dead"), but no further info will be returned right away.
> Or is it guaranteed that a subsequent
> successful guest-get-vcpus will still be reliably to learn after the
> fact what happened?
Yes, that is both the intent and implied by "unspecified" (as opposed to
"undefined").
> Would it make any more sense to have only a
> guest-set-vcpu, which attempts to set the state of a single vcpu,
> instead of an open-ended array of successive vcpu modifications in
> guest-set-vcpus?
The current interface can be special-cased into that type of call,
however I wanted to provide a batch interface (flipping 100 VCPUs
shouldn't take 100 round trips).
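To make the round-trip argument concrete, here is a minimal sketch of building one batched request for 100 VCPUs. The `execute`/`arguments` envelope is the usual guest agent wire format; the argument name `vcpus` and the helper `build_set_vcpus_request` are assumptions for illustration and are not taken from this patch.

```python
import json


def build_set_vcpus_request(logical_ids, online):
    """Build a single guest-set-vcpus request covering many VCPUs.

    The "vcpus" argument name is assumed here; check the schema in the
    series for the real one.
    """
    vcpus = [{"logical-id": i, "online": online} for i in logical_ids]
    return json.dumps({"execute": "guest-set-vcpus",
                       "arguments": {"vcpus": vcpus}})


# One message (and so one round trip) that offlines VCPUs 1..100.
request = build_set_vcpus_request(range(1, 101), False)
```

A single-VCPU call is then just the special case of a one-element list.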
> The interface seems relatively sane, though, and it looks like something
> that libvirt would be able to use without having to add any new APIs
> (just a new flag value to the existing virDomainSetVcpusFlags() function).
Oh.
virDomainSetVcpusFlags() [src/libvirt.c]
qemuDomainSetVcpusFlags() [src/qemu/qemu_driver.c]
qemuDomainHotplugVcpus()
qemuMonitorSetCPU() [src/qemu/qemu_monitor.c]
qemuMonitorTextSetCPU()
"cpu_set %d %s"
Does this work? I can't find any trace of the "cpu_set" (or the
"set_cpu") monitor command in upstream qemu.
The relevant libvirt commits are:
- e8d6c289 Support VCPU hotplug in QEMU guests
("NB, currently untested since QEMU segvs when running this!")
- a980d123 Fix CPU hotplug command names
If this works and I'm just not seeing something then I have no reason to
pursue this series.
... Ah I understand now. "cpu_set" *is* supported by the qemu-kvm
project at <git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git> -- and
by RHEL-6 qemu-kvm --, via ACPI.
I'll have to test this in RHEL-6. If it doesn't work, I should check
why. If it does, I'll have to figure out if I should continue to work on
this.
I wonder why <git://git.qemu.org/qemu.git> doesn't have "cpu_set".
Thanks
Laszlo

On 03/05/2013 04:05 PM, Laszlo Ersek wrote:
>> The interface seems relatively sane, though, and it looks like something
>> that libvirt would be able to use without having to add any new APIs
>> (just a new flag value to the existing virDomainSetVcpusFlags() function).
>
> Oh.
>
> virDomainSetVcpusFlags() [src/libvirt.c]
> qemuDomainSetVcpusFlags() [src/qemu/qemu_driver.c]
> qemuDomainHotplugVcpus()
> qemuMonitorSetCPU() [src/qemu/qemu_monitor.c]
> qemuMonitorTextSetCPU()
> "cpu_set %d %s"
>
> Does this work? I can't find any trace of the "cpu_set" (or the
> "set_cpu") monitor command in upstream qemu.
>
The old cpu_set HMP command "worked" in something like qemu 0.10, and
was ripped out when we realized it didn't actually work in a way that
was guaranteed to be safe to the guest. Since then, the libvirt command
has been a guaranteed failure on qemu, although it continues to work on
xen (and since it has been several YEARS now of not working, people are
laughing at qemu for not getting cpu hotplug working when xen has had it
for so long).
Basically, libvirt would add a new flag that requests using the guest
agent command instead of the monitor command (supposing that we ever do
get around to having a working monitor command that uses ACPI cpu hot
unplug).
> The relevant libvirt commits are:
> - e8d6c289 Support VCPU hotplug in QEMU guests
> ("NB, currently untested since QEMU segvs when running this!")
> - a980d123 Fix CPU hotplug command names
>
> If this works and I'm just not seeing something then I have no reason to
> pursue this series.
No, it doesn't work because the HMP command was (intentionally) removed
several years ago when it was determined to be broken.
>
> ... Ah I understand now. "cpu_set" *is* supported by the qemu-kvm
> project at <git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git> -- and
> by RHEL-6 qemu-kvm --, via ACPI.
Except that the ACPI approach didn't quite work, so qemu-kvm doesn't
expose that right now.
>
> I'll have to test this in RHEL-6. If it doesn't work, I should check
> why. If it does, I'll have to figure out if I should continue to work on
> this.
Yes, PLEASE continue to work on this - having the guest agent as an
alternative to ACPI has proven useful in other respects (for example, we
wired up virDomainShutdownFlags() to let the user choose between
guest-agent, ACPI, or hypervisor choice).
>
> I wonder why <git://git.qemu.org/qemu.git> doesn't have "cpu_set".
Because getting ACPI hotplug to work correctly has been harder than
anyone anticipated.

On 03/06/13 00:12, Eric Blake wrote:
> The old cpu_set HMP command "worked" in something like qemu 0.10, and
> was ripped out when we realized it didn't actually work in a way that
> was guaranteed to be safe to the guest. Since then, the libvirt command
> has been a guaranteed failure on qemu, although it continues to work on
> xen (and since it has been several YEARS now of not working, people are
> laughing at qemu for not getting cpu hotplug working when xen has had it
> for so long).
Under xen there's a separate comms method for requesting this (dom0 side
massaging of a specific node in xenstore + xenstore watch in the guest
kernel on that node).
http://wiki.xen.org/wiki/XenBus
http://wiki.xen.org/wiki/Event_Channel_Internals
>> I'll have to test this in RHEL-6. If it doesn't work, I should check
>> why. If it does, I'll have to figure out if I should continue to work on
>> this.
>
> Yes, PLEASE continue to work on this - having the guest agent as an
> alternative to ACPI has proven useful in other respects (for example, we
> wired up virDomainShutdownFlags() to let the user choose between
> guest-agent, ACPI, or hypervisor choice).
OK.
Laszlo

----- Original Message -----
> On 03/05/13 22:08, Eric Blake wrote:
> > On 03/04/2013 03:19 PM, Laszlo Ersek wrote:
> >> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> >> ---
> >
> >> +# @guest-set-vcpus:
> >> +#
> >> +# Attempt to reconfigure (currently: enable/disable) logical processors inside
> >> +# the guest.
> >> +#
> >> +# The input list is processed node by node in order. In each node @logical-id
> >> +# is used to look up the guest VCPU, for which @online specifies the requested
> >> +# state. The set of distinct @logical-id's is only required to be a subset of
> >> +# guest-supported identifiers. There's no restriction on list length or on
> >> +# repeating the same @logical-id (with possibly different @online field).
> >> +# Preferably the input list should describe a modified subset of
> >> +# @guest-get-vcpus' return value.
> >> +#
> >> +# If part or whole of the requested operation can't be carried out, the guest
> >> +# VCPU state will be unspecified.
> >
> > Completely unspecified?
>
> Yes. "Unspecified" means "valid" (ie. at least one VCPU will be online,
> the guest won't be "dead"), but no further info will be returned at once.
>
> > Or is it guaranteed that a subsequent
> > successful guest-get-vcpus will still be reliably to learn after the
> > fact what happened?
>
> Yes, that is both the intent and implied by "unspecified" (as opposed to
> "undefined").
>
> > Would it make any more sense to have only a
> > guest-set-vcpu, which attempts to set the state of a single vcpu,
> > instead of an open-ended array of successive vcpu modifications in
> > guest-set-vcpus?
>
> The current interface can be special-cased into that type of call,
> however I wanted to provide a batch interface (flipping 100 VCPUs
> shouldn't take 100 round trips).
>
> > The interface seems relatively sane, though, and it looks like something
> > that libvirt would be able to use without having to add any new APIs
> > (just a new flag value to the existing virDomainSetVcpusFlags() function).
>
> Oh.
>
> virDomainSetVcpusFlags() [src/libvirt.c]
> qemuDomainSetVcpusFlags() [src/qemu/qemu_driver.c]
> qemuDomainHotplugVcpus()
> qemuMonitorSetCPU() [src/qemu/qemu_monitor.c]
> qemuMonitorTextSetCPU()
> "cpu_set %d %s"
>
> Does this work? I can't find any trace of the "cpu_set" (or the
> "set_cpu") monitor command in upstream qemu.
>
> The relevant libvirt commits are:
> - e8d6c289 Support VCPU hotplug in QEMU guests
> ("NB, currently untested since QEMU segvs when running this!")
> - a980d123 Fix CPU hotplug command names
>
> If this works and I'm just not seeing something then I have no reason to
> pursue this series.
>
> ... Ah I understand now. "cpu_set" *is* supported by the qemu-kvm
> project at <git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git> -- and
> by RHEL-6 qemu-kvm --, via ACPI.
>
> I'll have to test this in RHEL-6. If it doesn't work, I should check
> why. If it does, I'll have to figure out if I should continue to work on
> this.
CPU hotplug works on RHEL-6, and Igor is also pushing it to upstream
qemu now, but unplug doesn't work. We need this alternative solution to
support both plug and unplug while unplug gets worked out.
>
> I wonder why <git://git.qemu.org/qemu.git> doesn't have "cpu_set".
>
> Thanks
> Laszlo
>

On 03/05/2013 04:05 PM, Laszlo Ersek wrote:
>>> +# If part or whole of the requested operation can't be carried out, the guest
>>> +# VCPU state will be unspecified.
>>
>> Completely unspecified?
>
> Yes. "Unspecified" means "valid" (ie. at least one VCPU will be online,
> the guest won't be "dead"), but no further info will be returned at once.
Hmm, just thinking aloud here (not saying we need to swap interfaces,
unless you like this alternative):
What if we have guest-set-vcpus return a non-negative integer on
success; namely, the number of consecutive array actions that were
completed, and guarantee successful exit on first failure if any prior
element was acted on? Passing an empty array, or failing on the first
array element, would give an error; otherwise, the error is lost if a
user batches commands, but they would know how much of the batch failed,
and can retry the command with the failing entry first to see what the
failure was (assuming the failure is reproducible). Basically, this
would make guest-set-vcpus do partial write detection somewhat like write().
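A rough sketch of the write()-like semantics proposed above, for illustration only. The names `VcpuError`, `set_one`, `guest_set_vcpus` and `set_all`, and the dict modelling guest VCPU state, are all hypothetical; nothing here is taken from the series.

```python
class VcpuError(Exception):
    """Hypothetical error raised when a single VCPU change fails."""


def set_one(online_by_id, node):
    # Stand-in for the real per-VCPU operation inside the guest.
    if node["logical-id"] not in online_by_id:
        raise VcpuError(f"unsupported logical-id: {node['logical-id']}")
    online_by_id[node["logical-id"]] = node["online"]


def guest_set_vcpus(online_by_id, requests):
    """Return the number of leading entries that were processed.

    An empty list, or a failure on the very first entry, is an error.
    A failure on a later entry ends the call successfully with a short count.
    """
    if not requests:
        raise VcpuError("empty request list")
    done = 0
    for node in requests:
        try:
            set_one(online_by_id, node)
        except VcpuError:
            if done == 0:
                raise  # nothing was applied, so report the real error
            break      # partial success: the error itself is dropped
        done += 1
    return done


def set_all(online_by_id, requests):
    """Caller-side retry loop, in the style of a loop around write()."""
    pos = 0
    while pos < len(requests):
        # On retry the failing entry comes first, so its error surfaces.
        pos += guest_set_vcpus(online_by_id, requests[pos:])
```

With this shape the caller always knows how much of the batch took effect, at the cost of one extra call to learn why it stopped.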

On 03/06/13 14:49, Eric Blake wrote:
> On 03/05/2013 04:05 PM, Laszlo Ersek wrote:
>>>> +# If part or whole of the requested operation can't be carried out, the guest
>>>> +# VCPU state will be unspecified.
>>>
>>> Completely unspecified?
>>
>> Yes. "Unspecified" means "valid" (ie. at least one VCPU will be online,
>> the guest won't be "dead"), but no further info will be returned at once.
>
> Hmm, just thinking aloud here (not saying we need to swap interfaces,
> unless you like this alternative):
>
> What if we have guest-set-vcpus return a non-negative integer on
> success; namely, the number of consecutive array actions that were
> completed, and guarantee successful exit on first failure if any prior
> element was acted on? Passing an empty array, or failing on the first
> array element, would give an error; otherwise, the error is lost if a
> user batches commands, but they would know how much of the batch failed,
> and can retry the command with the failing entry first to see what the
> failure was (assuming the failure is reproducible). Basically, this
> would make guest-set-vcpus do partial write detection somewhat like write().
You can sell me anything POSIX :)
Thanks!
Laszlo