Comments

On Tue, Mar 30, 2010 at 09:04:28PM +0300, Blue Swirl wrote:
> On 3/25/10, Juergen Lock <nox@jelal.kn-bremen.de> wrote:> > Hi!> >> > Now that qemu git head works again (thanx Aurelien! :) I've finished> > the FreeBSD qemu-devel port update patch/shar that made me uncover> > the bug:> > http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.patch> > resp.> > http://people.freebsd.org/~nox/qemu/qemu-devel-20100323.shar> >> > This also adds a few misc fixes (that I'll submit on the qemu list> > seperately), I have...> >> > . Fixed the FreeBSD executable path detection to work without /proc> > mounted (it usually isn't on FreeBSD), so you now no longer have to> > pass the path to the pc-bios dir with -L if you run qemu out of the> > build dir when another version is installed, like,> > work/qemu-snapshot-20100323_20/i386-softmmu/qemu ...> >> > (files/patch-vl.c in the shar/patch)> >> > . Fixed some more bsd-user bugs so all of i386-bsd-user, x86_64-bsd-user,> > and sparc64-bsd-user now run for me again on FreeBSD stable/8 amd64.> > (I didn't test sparc-bsd-user as I only tried -bsd freebsd and FreeBSD> > doesn't run on 32bit sparc.) - Yes bsd-user still needs more work but> > at least simple exectuables run.> >> > (files/patch-bsd-user-mmap.c, files/patch-exec.c)> >> > . Fixed the bsd-user host page protection code for FreeBSD hosts> > (using kinfo_getvmmap(3) on FeeBSD >= 7.x and /compat/linux/proc> > on older FreeBSD.)> >> > (files/patch-bsd-user-linproc)> >> > . Fixed some compilation warnings and a missing #include.> >> > (files/patch-qemu-char.c, files/patch-qemu-timer.c)> >> > Thanks, applied all except exec.c one.
Oh, is there something wrong with it? You mean this one, right?
Subject: [PATCH] Avoid page_set_flags() assert in qemu-user host page
protection code
Message-ID: <20100325211421.GA52572@triton8.kn-bremen.de>
[...]
I first tried to replace the endaddr in the !h2g_valid(endaddr) case with
((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition
of the assert in page_set_flags() that was triggered on the ~0ul value),
but that caused the qemu process to grow into swap and made the box
usuable when that code was reached and I had to kill qemu. (The box has
8 GB RAM.) And so I thought just leaving that page range unprotected
if only the start address is valid was the lesser evil...
..and thanx for committing the other ones, :)
Juergen

On 03/30/2010 12:16 PM, Juergen Lock wrote:
> I first tried to replace the endaddr in the !h2g_valid(endaddr) case with> ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1> if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition> of the assert in page_set_flags() that was triggered on the ~0ul value),> but that caused the qemu process to grow into swap and made the box> usuable when that code was reached and I had to kill qemu. (The box has> 8 GB RAM.) And so I thought just leaving that page range unprotected> if only the start address is valid was the lesser evil...
What's are the real arguments to the page_set_flags that causes things
to go into swap? I can't imagine the range really being so large that
it causes massive allocation within that function...
r~

In article <4BB2540B.90704@twiddle.net> you write:
>On 03/30/2010 12:16 PM, Juergen Lock wrote:>> I first tried to replace the endaddr in the !h2g_valid(endaddr) case with>> ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1>> if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition>> of the assert in page_set_flags() that was triggered on the ~0ul value),>> but that caused the qemu process to grow into swap and made the box>> usuable when that code was reached and I had to kill qemu. (The box has>> 8 GB RAM.) And so I thought just leaving that page range unprotected>> if only the start address is valid was the lesser evil...>>What's are the real arguments to the page_set_flags that causes things>to go into swap? I can't imagine the range really being so large that>it causes massive allocation within that function...
Oh sorry if that was not clear, things go into swap if I _replace_ the
endaddr ~0ul (which caused the assert) with the max value the assert
still tolerates i.e.
((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1
which in this case seems to be 0x7fffffffffff:
#3 0x0000000060012731 in page_set_flags (start=140737488224256,
end=18446744073709551615, flags=32)
at /usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c:2426
2426 assert(end < ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS));
(gdb) i li 2426
Line 2426 of "/usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c" starts at address 0x60012662 <page_set_flags+34>
and ends at 0x60012675 <page_set_flags+53>.
(gdb) disassemble 0x60012662 0x60012675
Dump of assembler code from 0x60012662 to 0x60012675:
0x0000000060012662 <page_set_flags+34>: mov $0x7fffffffffff,%rax
^^^^^^^^^^^^^^
0x000000006001266c <page_set_flags+44>: cmp %rax,%rsi
0x000000006001266f <page_set_flags+47>: ja 0x60012718 <page_set_flags+216>
End of assembler dump.
(gdb) q
Cheers,
Juergen

On 03/30/2010 01:09 PM, Juergen Lock wrote:
> Oh sorry if that was not clear, things go into swap if I _replace_ the> endaddr ~0ul (which caused the assert) with the max value the assert> still tolerates i.e.> ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1> which in this case seems to be 0x7fffffffffff:
Yes, I got that. And I see from ...
> #3 0x0000000060012731 in page_set_flags (start=140737488224256, > end=18446744073709551615, flags=32)
... here that the range we're reserving is
0x7ffffffe0000 ... 0x7fffffffffff
which is a mere 128k range. Which ought to allocate no more than
a single leaf page table (and thus N-1 pages for the N-level table).
Which doesn't answer the question of why you'd wind up running out
of memory.
r~

On Tue, Mar 30, 2010 at 10:09:47PM +0200, Juergen Lock wrote:
> In article <4BB2540B.90704@twiddle.net> you write:> >On 03/30/2010 12:16 PM, Juergen Lock wrote:> >> I first tried to replace the endaddr in the !h2g_valid(endaddr) case with> >> ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1> >> if TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS (which comes from the condition> >> of the assert in page_set_flags() that was triggered on the ~0ul value),> >> but that caused the qemu process to grow into swap and made the box> >> usuable when that code was reached and I had to kill qemu. (The box has> >> 8 GB RAM.) And so I thought just leaving that page range unprotected> >> if only the start address is valid was the lesser evil...> >> >What's are the real arguments to the page_set_flags that causes things> >to go into swap? I can't imagine the range really being so large that> >it causes massive allocation within that function...> > Oh sorry if that was not clear, things go into swap if I _replace_ the> endaddr ~0ul (which caused the assert) with the max value the assert> still tolerates i.e.> ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS) - 1> which in this case seems to be 0x7fffffffffff:> > #3 0x0000000060012731 in page_set_flags (start=140737488224256, > end=18446744073709551615, flags=32)> at /usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c:2426> 2426 assert(end < ((abi_ulong)1 << L1_MAP_ADDR_SPACE_BITS));> (gdb) i li 2426> Line 2426 of "/usr/ports/emulators/qemu-devel-20100323a/work/qemu-snapshot-20100323_20/exec.c" starts at address 0x60012662 <page_set_flags+34>> and ends at 0x60012675 <page_set_flags+53>.> (gdb) disassemble 0x60012662 0x60012675> Dump of assembler code from 0x60012662 to 0x60012675:> 0x0000000060012662 <page_set_flags+34>: mov $0x7fffffffffff,%rax> ^^^^^^^^^^^^^^> 0x000000006001266c <page_set_flags+44>: cmp %rax,%rsi> 0x000000006001266f <page_set_flags+47>: ja 0x60012718 <page_set_flags+216>> End of assembler dump.> (gdb) q
Ok sorry about the confusion, this is a different problem, I just looked
at the value of start, it seems to be:
(gdb) p start
$2 = 0x7ffffffe0000
So I'd say the real problem is page_set_flags() has a bug that makes
it allocate too much if the range is the last allowed page...
Cheers,
Juergen

On 03/30/2010 01:42 PM, Juergen Lock wrote:
> So I'd say the real problem is page_set_flags() has a bug that makes> it allocate too much if the range is the last allowed page...
It doesn't, as far as I can see. I added this range by hand to page_init
and the effect was exactly as I supposed on a linux host -- 2 pages
allocated to handle the 3-level page table. No out of memory.
r~

> Oh this is happening with x86_64-bsd-user on the same arch so I'd say> (abi_ulong)-1 would be the same as ~0ul (and still cause the assert.)
No. These two are different when sizeof(abi_ulong) < sizeof(long).
Paul

On Thu, Apr 01, 2010 at 11:59:11AM +0000, Paul Brook wrote:
> > Oh this is happening with x86_64-bsd-user on the same arch so I'd say> > (abi_ulong)-1 would be the same as ~0ul (and still cause the assert.)> > No. These two are different when sizeof(abi_ulong) < sizeof(long).
Yeah sorry I meant in this case that caused the assert. (x86_64-bsd-user
on same-arch host.)
Cheers,
Juergen