Commit Message

The auto-ballooning feature automatically performs balloon inflate
or deflate based on host and guest memory pressure. This can help to
avoid swapping or worse in both host and guest.

Auto-ballooning has a host and a guest part. The host performs
automatic inflate by requesting the guest to inflate its balloon
when the host is facing memory pressure. The guest performs
automatic deflate when it's facing memory pressure itself. It's
expected that auto-inflate and auto-deflate will balance each
other over time.

This commit implements the host side of auto-ballooning.

To be notified of host memory pressure, this commit makes use of this
kernel API proposal being discussed upstream:

 http://marc.info/?l=linux-mm&m=135513372205134&w=2

Three new properties are added to the virtio-balloon device to activate
auto-ballooning:

 o auto-balloon-mempressure-path: this is the path for the kernel's
   mempressure cgroup notification dir, which must be already mounted
   (see link above for details on this)

 o auto-balloon-level: the memory pressure level to trigger auto-balloon.
   Valid values are:

   - low: the kernel is reclaiming memory for new allocations
   - medium: some swapping activity has already started
   - oom: the kernel will start playing russian roulette real soon

 o auto-balloon-granularity: percentage of current guest memory by which
   the balloon should be inflated. For example, a value of 1 corresponds
   to 1% which means that a guest with 1G of memory will get its balloon
   inflated to 10485K.

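As a rough illustration of the granularity arithmetic above (this is not
code from the patch): the helper below is hypothetical, assumes sizes are
tracked in KiB, and truncates the result, which reproduces the 10485K
figure for a 1G guest at 1%.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: size of one auto-inflate
 * step in KiB, given the current guest memory and the value of
 * auto-balloon-granularity (a percentage).
 */
uint64_t auto_balloon_step_kib(uint64_t guest_mem_kib, unsigned granularity_pct)
{
    assert(granularity_pct <= 100);
    /* Integer division truncates: 1% of 1048576 KiB is 10485 KiB. */
    return guest_mem_kib * granularity_pct / 100;
}
```

Calling auto_balloon_step_kib(1048576, 1) gives 10485, matching the
example in the property description.
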
To test this, you need a kernel with the mempressure API patch applied and
the guest side of auto-ballooning.

Then the feature can be enabled like:

 qemu [...] \
   -balloon virtio,auto-balloon-mempressure-path=/sys/fs/cgroup/mempressure/,auto-balloon-level=low,auto-balloon-granularity=1

FIXMEs:

 o rate-limit the event? Can receive several in a row
 o add auto-balloon-maximum to limit the inflate?
 o this shouldn't override balloon changes done by the user manually

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
hw/virtio-balloon.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++
hw/virtio-balloon.h | 4 ++
hw/virtio-pci.c | 5 ++
3 files changed, 165 insertions(+)

On Tue, 18 Dec 2012 14:53:30 -0800
Anton Vorontsov <anton.vorontsov@linaro.org> wrote:

> Hello Luiz,
>
> On Tue, Dec 18, 2012 at 06:16:55PM -0200, Luiz Capitulino wrote:
> > The auto-ballooning feature automatically performs balloon inflate
> > or deflate based on host and guest memory pressure. This can help to
> > avoid swapping or worse in both host and guest.
> >
> > Auto-ballooning has a host and a guest part. The host performs
> > automatic inflate by requesting the guest to inflate its balloon
> > when the host is facing memory pressure. The guest performs
> > automatic deflate when it's facing memory pressure itself. It's
> > expected that auto-inflate and auto-deflate will balance each
> > other over time.
> >
> > This commit implements the host side of auto-ballooning.
> >
> > To be notified of host memory pressure, this commit makes use of this
> > kernel API proposal being discussed upstream:
> >
> > http://marc.info/?l=linux-mm&m=135513372205134&w=2
>
> Wow, you're fast! And I'm glad that it works for you, so we have two
> full-featured mempressure cgroup users already.

Thanks, although I think we need more testing to be sure this does what
we want. I mean, the basic mechanics do work, but my testing has been
very light so far.

> Even though it is a qemu patch, I think we should Cc linux-mm folks on it,
> just to let them know the great news.

I'll do it next time.

> > Wow, you're fast! And I'm glad that it works for you, so we have two
> > full-featured mempressure cgroup users already.
>
> Thanks, although I think we need more testing to be sure this does what we
> want. I mean, the basic mechanics do work, but my testing has been very
> light so far.

Is it possible to assign different weights for different VMs, something like the VMware 'shares' setting?

On Thu, 20 Dec 2012 05:24:12 +0000
Dietmar Maurer <dietmar@proxmox.com> wrote:

> > > Wow, you're fast! And I'm glad that it works for you, so we have two
> > > full-featured mempressure cgroup users already.
> >
> > Thanks, although I think we need more testing to be sure this does what we
> > want. I mean, the basic mechanics do work, but my testing has been very
> > light so far.
>
> Is it possible to assign different weights for different VMs, something like the VMware 'shares' setting?

This series doesn't have the "weight" concept, it has auto-balloon-level and
auto-balloon-granularity. The former allows you to choose which type of
kernel low-mem level you want auto-inflate to trigger. The latter allows you
to say by how much the balloon should grow (as a percentage of the guest's
current memory).
Both of them are per VM.

Hi Luiz,

On (Tue) 18 Dec 2012 [18:16:55], Luiz Capitulino wrote:
> The auto-ballooning feature automatically performs balloon inflate
> or deflate based on host and guest memory pressure. This can help to
> avoid swapping or worse in both host and guest.
>
> Auto-ballooning has a host and a guest part. The host performs
> automatic inflate by requesting the guest to inflate its balloon
> when the host is facing memory pressure. The guest performs
> automatic deflate when it's facing memory pressure itself. It's
> expected that auto-inflate and auto-deflate will balance each
> other over time.

What does this last line mean?

> This commit implements the host side of auto-ballooning.
>
> To be notified of host memory pressure, this commit makes use of this
> kernel API proposal being discussed upstream:
>
> http://marc.info/?l=linux-mm&m=135513372205134&w=2

We should wait till these patches are upstream. Also, an error
message better than "can't open file ..." to indicate a newer kernel
is needed for this feature?

> Three new properties are added to the virtio-balloon device to activate
> auto-ballooning:
>
>  o auto-balloon-mempressure-path: this is the path for the kernel's
>    mempressure cgroup notification dir, which must be already mounted
>    (see link above for details on this)
>
>  o auto-balloon-level: the memory pressure level to trigger auto-balloon.
>    Valid values are:
>
>    - low: the kernel is reclaiming memory for new allocations
>    - medium: some swapping activity has already started
>    - oom: the kernel will start playing russian roulette real soon
>
>  o auto-balloon-granularity: percentage of current guest memory by which
>    the balloon should be inflated. For example, a value of 1 corresponds
>    to 1% which means that a guest with 1G of memory will get its balloon
>    inflated to 10485K.

This looks good. How about emitting a QMP message to notify
management of auto-ballooning?

> To test this, you need a kernel with the mempressure API patch applied and
> the guest side of auto-ballooning.
>
> Then the feature can be enabled like:
>
>  qemu [...] \
>    -balloon virtio,auto-balloon-mempressure-path=/sys/fs/cgroup/mempressure/,auto-balloon-level=low,auto-balloon-granularity=1
>
> FIXMEs:
>
>  o rate-limit the event? Can receive several in a row

For this, I'm thinking the highest severity level should be picked to
act upon: e.g. if the following events are received in succession:
medium
oom
low
Then 'oom' is the highest level, and that should be acted upon
(i.e. we shouldn't deflate the balloon on getting the 'low'
notification above). The guest can always deflate the balloon when it
needs RAM.
Repeated 'low' notifications can be ignored, if one has been acted
upon already.
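
A minimal sketch of this coalescing idea, assuming events arrive in
bursts and that the levels are ordered low < medium < oom. The enum, the
function names and the rule for when to act again are illustrative
assumptions, not part of the patch or of the mempressure API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ordering of the levels named in the commit message. */
enum pressure_level {
    PRESSURE_NONE = 0,
    PRESSURE_LOW,
    PRESSURE_MEDIUM,
    PRESSURE_OOM
};

/* Return the most severe level seen in a burst of n events. */
enum pressure_level highest_level(const enum pressure_level *events, size_t n)
{
    enum pressure_level max = PRESSURE_NONE;

    for (size_t i = 0; i < n; i++) {
        if (events[i] > max) {
            max = events[i];
        }
    }
    return max;
}

/*
 * Act only if the burst is more severe than the level already acted upon,
 * so a repeated 'low' (or a 'low' arriving after 'oom') is ignored.
 * When *last_acted is reset (e.g. after a quiet period) is left to the
 * caller; the thread does not specify it.
 */
bool should_inflate(enum pressure_level burst_max,
                    enum pressure_level *last_acted)
{
    if (burst_max == PRESSURE_NONE || burst_max <= *last_acted) {
        return false;
    }
    *last_acted = burst_max;
    return true;
}
```

For the medium, oom, low sequence above, highest_level() returns
PRESSURE_OOM, and a later burst containing only 'low' is not acted upon.
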
> o add auto-balloon-maximum to limit the inflate?
Yes, makes sense to add this.
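
One possible shape for such a limit, as a sketch only. It assumes
auto-balloon-maximum would be expressed as a balloon size in KiB, with 0
meaning "no limit"; neither the units nor the helper name come from the
patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: balloon size after one auto-inflate step, capped at
 * max_kib. A balloon that is already at or above the cap (for example after
 * a manual inflate) is left untouched rather than shrunk.
 */
uint64_t capped_balloon_kib(uint64_t cur_kib, uint64_t step_kib,
                            uint64_t max_kib)
{
    uint64_t next_kib = cur_kib + step_kib;

    if (max_kib == 0) {
        return next_kib;            /* no cap configured */
    }
    if (cur_kib >= max_kib) {
        return cur_kib;             /* never auto-inflate past the cap */
    }
    return next_kib > max_kib ? max_kib : next_kib;
}
```
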
> o this shouldn't override balloon changes done by the user manually
Can you think of examples here? If the user (host admin) has
ballooned an 8G guest down to 4G, auto-balloon will only further
shrink down the guest RAM, so there's no real 'overriding' happening
that I can think of. Of course, a guest can expand itself to, say,
5G, but that should be allowed as the guest might be under pressure.
Even in such a situation, the host has control by limiting the guest
using cgroups.

> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
> hw/virtio-balloon.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> hw/virtio-balloon.h | 4 ++
> hw/virtio-pci.c | 5 ++
> 3 files changed, 165 insertions(+)

Patch looks fine.
Amit

On Sat, 12 Jan 2013 02:02:32 +0530
Amit Shah <amit.shah@redhat.com> wrote:

> Hi Luiz,
>
> On (Tue) 18 Dec 2012 [18:16:55], Luiz Capitulino wrote:
> > The auto-ballooning feature automatically performs balloon inflate
> > or deflate based on host and guest memory pressure. This can help to
> > avoid swapping or worse in both host and guest.
> >
> > Auto-ballooning has a host and a guest part. The host performs
> > automatic inflate by requesting the guest to inflate its balloon
> > when the host is facing memory pressure. The guest performs
> > automatic deflate when it's facing memory pressure itself. It's
> > expected that auto-inflate and auto-deflate will balance each
> > other over time.
>
> What does this last line mean?

When qemu does auto-inflate, the guest memory will be reduced. Then it's
expected that something will increase it again. That something is
auto-deflate.

However, if we deflate too much, then the host may face memory pressure,
and then auto-inflate will take place again.

It's expected that this sequence of auto-inflate and auto-deflate will
reach a balance at some point in time.

> > This commit implements the host side of auto-ballooning.
> >
> > To be notified of host memory pressure, this commit makes use of this
> > kernel API proposal being discussed upstream:
> >
> > http://marc.info/?l=linux-mm&m=135513372205134&w=2
>
> We should wait till these patches are upstream.

Right.

> Also, an error
> message better than "can't open file ..." to indicate a newer kernel
> is needed for this feature?

Seems a good idea.

> > Three new properties are added to the virtio-balloon device to activate
> > auto-ballooning:
> >
> >  o auto-balloon-mempressure-path: this is the path for the kernel's
> >    mempressure cgroup notification dir, which must be already mounted
> >    (see link above for details on this)
> >
> >  o auto-balloon-level: the memory pressure level to trigger auto-balloon.
> >    Valid values are:
> >
> >    - low: the kernel is reclaiming memory for new allocations
> >    - medium: some swapping activity has already started
> >    - oom: the kernel will start playing russian roulette real soon
> >
> >  o auto-balloon-granularity: percentage of current guest memory by which
> >    the balloon should be inflated. For example, a value of 1 corresponds
> >    to 1% which means that a guest with 1G of memory will get its balloon
> >    inflated to 10485K.
>
> This looks good.

Actually, for the next version I'll try the user-space shrinker API.

> How about emitting a QMP message to notify
> management of auto-ballooning?

I could think about that if they are interested.

> > To test this, you need a kernel with the mempressure API patch applied and
> > the guest side of auto-ballooning.
> >
> > Then the feature can be enabled like:
> >
> >  qemu [...] \
> >    -balloon virtio,auto-balloon-mempressure-path=/sys/fs/cgroup/mempressure/,auto-balloon-level=low,auto-balloon-granularity=1
> >
> > FIXMEs:
> >
> >  o rate-limit the event? Can receive several in a row
>
> For this, I'm thinking the highest severity level should be picked to
> act upon: e.g. if the following events are received in succession:
>
> medium
> oom
> low
>
> Then 'oom' is the highest level, and that should be acted upon
> (i.e. we shouldn't deflate the balloon on getting the 'low'
> notification above). The guest can always deflate the balloon when it
> needs RAM.
>
> Repeated 'low' notifications can be ignored, if one has been acted
> upon already.

Makes sense. Although, as I said above, I'll try the user-space shrinker API
for the next version.

> > o add auto-balloon-maximum to limit the inflate?
>
> Yes, makes sense to add this.
>
> > o this shouldn't override balloon changes done by the user manually
>
> Can you think of examples here? If the user (host admin) has
> ballooned an 8G guest down to 4G, auto-balloon will only further
> shrink down the guest RAM, so there's no real 'overriding' happening
> that I can think of. Of course, a guest can expand itself to, say,
> 5G, but that should be allowed as the guest might be under pressure.

Yes, that's exactly my point above. I mean, taking your example, if
the user has ballooned an 8G guest down to 4G, should auto-balloon be allowed
to balloon to 5G or even back to 8G?

> Even in such a situation, the host has control by limiting the guest
> using cgroups.
>
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > ---
> > hw/virtio-balloon.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > hw/virtio-balloon.h | 4 ++
> > hw/virtio-pci.c | 5 ++
> > 3 files changed, 165 insertions(+)
>
> Patch looks fine.
>
> Amit

On (Mon) 14 Jan 2013 [09:58:30], Luiz Capitulino wrote:
> On Sat, 12 Jan 2013 02:02:32 +0530
> Amit Shah <amit.shah@redhat.com> wrote:
>
> > Hi Luiz,
> >
> > On (Tue) 18 Dec 2012 [18:16:55], Luiz Capitulino wrote:
> > > The auto-ballooning feature automatically performs balloon inflate
> > > or deflate based on host and guest memory pressure. This can help to
> > > avoid swapping or worse in both host and guest.
> > >
> > > Auto-ballooning has a host and a guest part. The host performs
> > > automatic inflate by requesting the guest to inflate its balloon
> > > when the host is facing memory pressure. The guest performs
> > > automatic deflate when it's facing memory pressure itself. It's
> > > expected that auto-inflate and auto-deflate will balance each
> > > other over time.
> >
> > What does this last line mean?
>
> When qemu does auto-inflate, the guest memory will be reduced. Then it's
> expected that something will increase it again. That something is
> auto-deflate.
>
> However, if we deflate too much, then the host may face memory pressure,
> and then auto-inflate will take place again.
>
> It's expected that this sequence of auto-inflate and auto-deflate will
> reach a balance at some point in time.

This all depends on the loads on the systems and other external
factors; we never know how things might behave. Let's just leave this
line out.

> > How about emitting a QMP message to notify
> > management of auto-ballooning?
>
> I could think about that if they are interested.

I think they would be - no harm in asking.

> > > o this shouldn't override balloon changes done by the user manually
> >
> > Can you think of examples here? If the user (host admin) has
> > ballooned an 8G guest down to 4G, auto-balloon will only further
> > shrink down the guest RAM, so there's no real 'overriding' happening
> > that I can think of. Of course, a guest can expand itself to, say,
> > 5G, but that should be allowed as the guest might be under pressure.
>
> Yes, that's exactly my point above. I mean, taking your example, if
> the user has ballooned an 8G guest down to 4G, should auto-balloon be allowed
> to balloon to 5G or even back to 8G?

I think it should be allowed by default, and tunable by
user/management (i.e. qemu shouldn't create policy, but enforce one if
provided).
Amit
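
A sketch of what "enforce a policy if one is provided" could look like.
The struct, its fields and the function are hypothetical and sizes are
in KiB. With no policy set, a guest ballooned from 8G down to 4G may grow
back as far as its full 8G; with a limit set by the user or management,
the request is clamped to that limit:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy supplied by the user or management software. */
struct balloon_policy {
    bool     has_guest_mem_limit;   /* false: no policy, allow any deflate */
    uint64_t guest_mem_limit_kib;   /* ceiling on guest memory when set    */
};

/*
 * Guest memory size granted for an auto-deflate request. The guest can
 * never exceed its total memory; an optional policy lowers that ceiling.
 */
uint64_t allowed_guest_mem_kib(const struct balloon_policy *policy,
                               uint64_t requested_kib, uint64_t total_kib)
{
    uint64_t limit_kib = total_kib;

    if (policy->has_guest_mem_limit &&
        policy->guest_mem_limit_kib < limit_kib) {
        limit_kib = policy->guest_mem_limit_kib;
    }
    return requested_kib < limit_kib ? requested_kib : limit_kib;
}
```

For the 8G guest in the example, a request to grow to 5G is granted in
full when no policy is set, and clamped to 4G when the policy limit is 4G.
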