
Hello fellow UML users :)
I had the privilege of meeting some of you already.
Thanks for your support, write-ups and all the stuff
you do that makes UML nicer/easier to use.
I have a rackserver in colocation, on which I am
running instances of uml.
One of them is doing fine and hasn't needed a reboot
for some time now. It doesn't have any significant
load either, it's just running tinydns as a nameserver.
The other host is running Apache and mysql and
is a usermachine where sites are hosted.
It needs a reboot every two hours :(
It just locks up: I'm not able to reach
it through the network, nor through
uml_mconsole...
I need to remove the tap device and respawn
the uml to get it running again.
If I 'forget' to remove and reinstantiate
the tap device, networking simply fails to
work.
I'm running a vanilla 2.4.20 kernel on the host,
compiled with gcc 3.2 and -march=athlon-mp.
It's a dual Athlon 1600 with 1GB of RAM and
highmem enabled.
The uml kernels are 2.4.19-17um: the nameserver
has 64MB of RAM and the server that likes
to lock up has 384MB of RAM, because 512MB
of RAM makes the uml segfault. Both the host
and the user mode linux instances are running
Debian testing and they use devfs.
If you need any more information,
I will be happy to provide it.
Sorry if I run a version of uml that
is out of date or I missed some obvious
patch. I would like to hear it if this
is the case.
best regards
Vincent Touquet

On Thu, Dec 12, 2002 at 12:27:16AM +0100, Vincent Touquet wrote:
>The other host is running Apache and mysql and
>is a usermachine where sites are hosted.
>It needs a reboot every two hours :(
>It just locks up, its not able to get to
>it through the network, nor through the
>uml_mconsole...
(cut)
Sorry for the ambiguous use of 'host'.
I obviously meant the second uml instance
that is running on the rackserver (the host).
So I have an SMP rack server running
two uml instances, with one misbehaving.
regards,
Vincent

On Wednesday 11 December 2002 23:27, Vincent Touquet wrote:
> The other host is running Apache and mysql and
> is a usermachine where sites are hosted.
> It needs a reboot every two hours :(
> It just locks up, its not able to get to
> it through the network, nor through the
> uml_mconsole...
Our experience is that skas mode UMLs are marginally more stable than tt mode
UMLs; make sure you have the skas patch applied to your host kernel and that
your UML kernel is using it. skas-mode UMLs also allow you to send more
helpful bug reports to Jeff when things go wrong; you can run gdb on the host
and "attach 12345" where 12345 is the process number of the hung UML, and
then "bt" for a helpful backtrace. tt mode UMLs can't provide this as
easily, and are much slower too.
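If you want to grab that backtrace without typing the gdb commands by hand, something like the following might do. It is only a rough sketch: the helper names are made up for illustration, 12345 is a placeholder PID as above, and it assumes gdb is installed on the host and that you are allowed to attach to the hung UML process.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a backtrace, and exits."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # -batch: quit once the commands have run
    # -p:     attach to the running process with this PID
    # -ex:    execute the given gdb command ("bt" prints the backtrace)
    return ["gdb", "-batch", "-p", str(pid), "-ex", "bt"]


def capture_backtrace(pid):
    """Run gdb on the host and return everything it printed (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(pid), capture_output=True, text=True
    )
    return result.stdout + result.stderr
```

Calling capture_backtrace(12345) on the host, with the real PID of the hung UML, should give you text you can paste into a bug report.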
[snip]
> Sorry if I run a version of uml that
> is out of date or I missed some obvious
> patch. I would like to hear it if this
> is the case.
In general, you are *always* running an outdated UML patch :-) -17 is pretty
old though; I think we're up to -33 or -34 now. My impression is that Jeff
is largely uninterested in bug reports from old kernels, so keep up if you
want to send helpful reports.
--
Matthew Bloch Bytemark Computer Consulting Limited
http://www.bytemark.co.uk/
tel. +44 (0) 8707 455026

>>>>> "Vincent" == Vincent Touquet <vincent@...> writes:
Vincent> Hello fellow UML users :) I had the privilege of meeting
Vincent> some of you already.
Without reading all your mail, I have had some stability problems with
anything newer than 19-16 at least. I haven't tried -17 though I'm
testing -15 on one host using 20-rc1 host and that seems more stable
than 16 was. My most stable so far is 8 but I haven't tried 9->14.
I'm now running 19-36 on a 2.4.20 host with Jeff's SKAS patches
(Thanks again to Jeff) and it's been up for 2 days so far.
Sincerely,
Adrian
--
Your mouse has moved.
Windows NT must be restarted for the change to take effect.
Reboot now? [OK]

Hello,
> The other host is running Apache and mysql and
> is a usermachine where sites are hosted.
> It needs a reboot every two hours :(
Do you see an error on the uml's console after it has died?
> I need to remove the tap device and respawn
> the uml to get it back in to place.
> If I 'forget' to remove and reinstantiate
> the tap device, networking simply fails to
> work.
Yes, this is normal. It seems some uml crashes screw the tap device.
> The uml kernels are 2.4.19-17um
I'd try the latest, which is currently -37um, and if that doesn't work, I
always find -8um is a good bet as this is before Jeff did all the skas
and smp work.
As Matthew already said, if you use -37 and put the skas3 patch + enable
/proc/mm on your host then you can use SKAS UMLs, which are faster.
> to lockup with 384MB of RAM because 512M
> RAM makes the uml segfault.
You need to enable CONFIG_KERNEL_HALF_GIGS or HIGHMEM to use that.
Apparently HIGHMEM is slow with tt but should be fine under skas.
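A quick way to see whether the host is ready for skas is to look for /proc/mm. The sketch below assumes that the presence of /proc/mm is a good enough sign that the skas3 host patch is applied and enabled; the function name is made up for illustration.

```python
import os


def host_has_proc_mm(proc_mm_path="/proc/mm"):
    """Return True if the given path exists.

    Assumption: on a host with the skas3 patch applied and /proc/mm
    enabled, /proc/mm is present; on an unpatched host it is not.
    """
    return os.path.exists(proc_mm_path)
```

If host_has_proc_mm() returns False on the host, the UML will most likely have to run in tt mode.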
Bye for Now,
Ian

Hello,
> > Yes, this is normal. It seems some uml crashes screw the tap device.
> Doesn't sound normal to me. Sounds like a bug to me.
ok, normal was a bad choice of words :)
What I meant by that was that I'd seen this behaviour many times after a
uml crash, and David mentioned he'd had the same problem, so it's a known
thing.
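For what it's worth, removing and re-creating the tap device after a crash could be scripted along these lines. This is a sketch under assumptions: it supposes the device is managed with tunctl from the uml utilities, and the device name tap0 and owner uml are placeholders. Bringing the interface back up with its address is left out, since that depends on your setup.

```python
import subprocess


def tap_reset_commands(device="tap0", owner="uml"):
    """Return the tunctl commands that delete and re-create a tap device.

    `device` and `owner` are placeholder values; use your own.
    """
    return [
        ["tunctl", "-d", device],               # delete the stale device
        ["tunctl", "-u", owner, "-t", device],  # re-create it for `owner`
    ]


def reset_tap(device="tap0", owner="uml"):
    """Run the commands in order on the host, stopping at the first failure."""
    for command in tap_reset_commands(device, owner):
        subprocess.run(command, check=True)
```

You would run reset_tap() as root on the host before respawning the crashed uml.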
Bye for Now,
Ian
\|||/
(o o)
/---------------------------ooO-(_)-Ooo---------------------------\
| Ian Chilton Web: http://www.ichilton.co.uk |
| E-Mail: ian@... Backup: ian@... |
|-----------------------------------------------------------------|
| There are 10 types of people in the world: |
| Those who understand binary, and those who don't. |
\-----------------------------------------------------------------/