user-mode-linux-devel

Hi people,
Finally, we got my university interested in doing a Linux practicum. I showed
them UML and they decided to use that, in combination with Putty and
Cygwin-X-server on Windows Desktops. So now they need a central server that
people can use for their work. Work will consist of: Installing the stuff,
configuring it, network-setup and -maintenance... Not really difficult stuff.
They asked me for a hardware spec of a server that could handle 30 clients.
I'm not really "into" hardware, so I hoped you would be willing to advise me
in this matter?
Things I thought out myself:
- Better have about 2GB of RAM, so each user can be assigned 64MB.
- Instead of 1x80GB SCSI HD, better have 8x10GB SCSI HD, for better
I/O-performance
- Probably a 2xAMD Athlon MP 2000+
- A full-duplex, 100MBit NIC ;-)
Any comments? Did I leave anything out? Am I on a wrong track here?
Help is appreciated!
--
Regards,
Tim Stoop
PGP public key: finger cvd@...
Random quote/fortune:
He laughs at every joke three times... once when it's told, once when it's
explained, and once when he understands it.

Thirty remote X sessions via UML might be a bit sluggish...
Cygwin X is already slow when connecting to a straight Linux box. Top
that off with only 64MB of RAM per person, and you're going to have some
very impatient users on your hands I think.
My experience with Cygwin X may not be representative, as I haven't had
time to experiment or trace the slowdown, but it seems to run at less
than half the speed of a remote X session running on Linux... And the
memory required to bring up a whole desktop in addition to X and all of
the Windows stuff may be the straw that breaks the camel's back.
--
Joe Cooper <joe@...>
Web caching appliances and support.
http://www.swelltech.com

Good evening, all,
On Sat, 21 Sep 2002, Tim Stoop wrote:
> Finally, we got my university interested in doing a Linux practicum. I showed
Fantastic!
> them UML and they decided to use that, in combination with Putty and
> Cygwin-X-server on Windows Desktops. So now they need a central server that
> people can use for their work. Work will consist of: Installing the stuff,
> configuring it, network-setup and -maintenance... Not really difficult stuff.
But somewhat time consuming. :-)
> They asked me for a hardware spec of a server that could handle 30 clients.
> I'm not really "into" hardware, so I hoped you would be willing to advise me
> in this matter?
This really comes down to the question raised by another poster;
are you looking to have an entire gnome/kde session with all the applets,
widgets, and apps all running on the uml server, with the client
responsible for the X display, and having a classroom of 30 people doing
this simultaneously?
Please don't take offense when I suggest the above isn't likely to
work. If you're truly running 30 live desktops with multiple
applications, each one is going to get 1/30 of 4GHz, about 133MHz, some of
which will be taken up by UML and host overhead, leaving an effective CPU
of 60-90MHz. Nobody's going to be happy with that for X apps.
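A rough sketch of that arithmetic in Python, for anyone who wants to try other numbers. The 4 GHz pool treats the two CPUs as one evenly shared resource, and the overhead fractions are assumptions picked only to bracket the 60-90 MHz range quoted above; they are not measurements of UML.

```python
def cpu_share_mhz(total_mhz, sessions, overhead_fraction=0.0):
    """Naive even split of total clock across sessions, minus assumed overhead."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return total_mhz / sessions * (1.0 - overhead_fraction)


# Two ~2 GHz CPUs treated as a 4000 MHz pool shared by 30 sessions.
raw = cpu_share_mhz(4000, 30)
# Hypothetical overhead fractions (assumptions, not measured values).
pessimistic = cpu_share_mhz(4000, 30, overhead_fraction=0.55)
optimistic = cpu_share_mhz(4000, 30, overhead_fraction=0.33)

print(f"raw share: {raw:.0f} MHz")
print(f"after overhead: {pessimistic:.0f}-{optimistic:.0f} MHz")
```

The even split is the worst case where every session is busy at once; idle sessions leave more for the others.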
Also, the 30 simultaneous heavy loads are going to be killed by
memory long before CPU. Desktops like that like lots of memory, or
they'll be swapping all the time. 5 or 10 people with moderate swap loads
are going to kill any disk bandwidth you need for loading apps and data.
So either these have to be otherwise relatively quiet UMLs, where
only a few people with full desktops are likely to be logged in at a time,
or ones where each UML is less heavily loaded.
> Things I thought out myself:
> - Better have about 2GB of RAM, so each user can be assigned 64MB.
More. More. More. :-)
4G as a starting point. More if you can get it. If you can't
afford it, slow down the CPU - I'm quite serious - and put the money
you save to more ram.
A moderately full desktop is going to swap quite a bit in 64M ram.
Given that every disk access makes the program in question wait an
extremely long time in computer terms, you're far better off having a lot
of memory to hold things in ram and cache the disk.
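The memory budget can be sanity-checked the same way. This is a minimal sketch: the 256 MB host reserve is a hypothetical figure for the host kernel and disk cache, not a recommendation from this thread.

```python
def ram_per_session_mb(total_ram_mb, sessions, host_reserve_mb=0):
    """Whole megabytes available to each session after an assumed host reserve."""
    usable = total_ram_mb - host_reserve_mb
    if sessions <= 0 or usable < 0:
        raise ValueError("need positive sessions and a reserve smaller than RAM")
    return usable // sessions


def total_ram_needed_mb(sessions, per_session_mb, host_reserve_mb=0):
    """RAM needed to give every session its allocation plus the host reserve."""
    return sessions * per_session_mb + host_reserve_mb


# 30 sessions at 64 MB each nearly fill a 2 GB box by themselves.
print(total_ram_needed_mb(30, 64), "MB needed for 30 x 64 MB")
# With 4 GB and a hypothetical 256 MB host reserve, each session gets more.
print(ram_per_session_mb(4096, 30, host_reserve_mb=256), "MB per session on 4 GB")
```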
> - Instead of 1x80GB SCSI HD, better have 8x10GB SCSI HD, for better
> I/O-performance
I went with 4x 120G IDE, raid 5'd for 360G usable.
You have the right idea, but you need to pay attention to _how_
those drives are set up. Here are the choices.
- Raid 0 - striping
The data is written in chunks, chunk 0 goes on the first drive,
chunk 1 to the second, etc., then you wrap around to the first again.
For N physical drives, you get N in capacity. The data is spread over all
drives, so reads and writes speed up.
You _lose_ reliability. If each drive has a mean time between
failures of 10000 hours, a 5 drive array has a mean time between failures
of 2000 hours. There's probably more math in there somewhere, but the
point that the failure of a single drive loses all your data is still
valid.
Effective capacity, N drives.
- Raid 1 - mirroring
All data is written to both drives in the pair (more than two
drives? set them up as N/2 pairs). For N physical drives, you get N/2 in
storage. Writes are a little slower, as the data has to go to two drives,
but reads speed up significantly as they can come off the drive that's
idle or whose head is closer to the data.
MTBF is doubled.
Effective capacity, N/2
- Raid 5 - parity data spread over all drives
Data is written to N-1 drives, and a checksum of the data on
those is written to the last drive (actually, the parity is spread out
over all of the drives, but you still get N-1 drives of usable space).
Since two drives have to die before you lose any data, MTBF is
doubled again. Reads speed up because the data can stream off multiple
drives. Writes slow down by a potential factor of 4 or more because to
write a single chunk, you have to read the old chunk, the old parity, and
write the new chunk and new parity.
The drives should be identical, because you can run into weird
spindle sync effects if they spin at different speeds.
Effective capacity, N-1.
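The read-old/write-new cost of a small raid 5 write follows from how the parity is kept. Here is a toy sketch, assuming XOR parity (the usual raid 5 scheme) with byte strings standing in for disk chunks; it illustrates the idea only and is not how any particular controller or driver is implemented.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length chunks byte by byte."""
    if len(a) != len(b):
        raise ValueError("chunks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def parity(chunks):
    """Parity chunk for one stripe: XOR of all its data chunks."""
    return reduce(xor_bytes, chunks)


def update_parity(old_parity, old_chunk, new_chunk):
    """Small write: needs the old chunk and old parity (two reads),
    then the new chunk and the new parity are written (two writes)."""
    return xor_bytes(xor_bytes(old_parity, old_chunk), new_chunk)


def rebuild(surviving_chunks, parity_chunk):
    """Recover the one missing chunk from the survivors plus parity."""
    return reduce(xor_bytes, surviving_chunks, parity_chunk)


stripe = [b"AAAA", b"BBBB", b"CCCC"]  # three data chunks, parity on a fourth drive
p = parity(stripe)

new_chunk = b"ZZZZ"
p = update_parity(p, stripe[1], new_chunk)
stripe[1] = new_chunk

assert p == parity(stripe)  # incremental update matches a full recompute
assert rebuild([stripe[0], stripe[2]], p) == new_chunk  # lost drive 1 recovered
```

The same XOR that updates the parity is what lets the array reconstruct a single dead drive.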
Also, you'll need to think about software vs. hardware raid. I
paid about $200 for a 3ware raid controller that will do the major raid
flavors. I went with raid 5 to protect my data. Now my CPU's are free to
work on user apps instead of calculating raid checksums.
One last quirk for any raid >=1; the system will run more slowly
when reconstructing the array after a system crash or a dead drive is
replaced. For raid 0, when a drive dies, everything disappears.
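The capacity figures and the simple raid 0 failure estimate above can be written down as a small Python sketch. The MTBF function uses only the first-order "divide by the number of drives" approximation from the text; the mirrored and parity cases depend on how quickly a dead drive is replaced and rebuilt, so they are deliberately not modelled here.

```python
def usable_capacity_gb(level, drives, drive_gb):
    """Usable space for raid 0, 1 (as N/2 mirrored pairs), or 5."""
    if level == 0:
        return drives * drive_gb
    if level == 1:
        if drives < 2 or drives % 2:
            raise ValueError("raid 1 pairs need an even number of drives")
        return drives // 2 * drive_gb
    if level == 5:
        if drives < 3:
            raise ValueError("raid 5 needs at least 3 drives")
        return (drives - 1) * drive_gb
    raise ValueError("only raid levels 0, 1 and 5 are handled")


def raid0_mtbf_hours(drive_mtbf_hours, drives):
    """Rough estimate: any single drive failure loses the whole stripe set."""
    return drive_mtbf_hours / drives


print(usable_capacity_gb(5, 4, 120), "GB usable from 4 x 120 GB in raid 5")
print(usable_capacity_gb(0, 8, 10), "GB usable from 8 x 10 GB in raid 0")
print(usable_capacity_gb(1, 8, 10), "GB usable from 8 x 10 GB in raid 1")
print(raid0_mtbf_hours(10000, 5), "hours for a 5-drive stripe")
```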
> - Probably a 2xAMD Athlon MP 2000+
That was my choice, though I went with the 1.6ghz 1900+.
> - A full-duplex, 100MBit NIC ;-)
> Any comments? Did I leave anything out? Am I on a wrong track here?
The web site describing the project is
http://www.stearns.org/slartibartfast/ . Take a look at
http://www.stearns.org/slartibartfast/uml-coop.current.html ; the
document's not done yet. I'm up to 24 virtual machines on the shared box.
Nobody's doing X on them at the moment (not because they can't, but just
because nobody's connected to the machine with a high speed connection).
I would suggest that you could probably feed up, say, 6-8 X
desktop users on a dual 2ghz athlon machine with 2G or more of ram. Make
that a quad CPU w/ 4G ram and you can probably handle 16-20.
If I were in your shoes, I'd look at buying 3 dual cpu/4G machines
with striped (raid 0) storage; back them up to each other. Put 10 client
machines on a lan with each UML server. You might want to consider a
network setup with a gigabit nic in the uml server talking to a gigabit
port on a switch (with the clients talking to the switch at 100mbit) so
the clients aren't fighting over 100mbit for lots of graphical data.
Although you might get away with just a straight 100mbit switch, not sure.
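A back-of-the-envelope sketch of why the server's uplink matters, in Python. It assumes the worst case where all clients pull data at the same moment, and it ignores protocol overhead and switch behaviour, so treat the numbers as upper bounds only.

```python
def per_client_mbit(server_link_mbit, client_link_mbit, clients):
    """Worst-case bandwidth per client when all clients are busy at once."""
    if clients <= 0:
        raise ValueError("clients must be positive")
    # Each client is limited by its own link and by its share of the server link.
    return min(client_link_mbit, server_link_mbit / clients)


print(per_client_mbit(100, 100, 10), "mbit each with a 100mbit server nic")
print(per_client_mbit(1000, 100, 10), "mbit each with a gigabit server nic")
```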
One last note. Keep in mind that Jeff and the other kernel
developers have done a great job, but there are still quirks to be worked
through when you get into a project like this.
- UML on SMP occasionally loses signals.
- There's a thundering herd of UML's waking up every three minutes
for 15-30 seconds.
- Dual AMD's run hot; my box was held up for a week while they
ordered new fans for it.
- Jeff's cleaning up some quirks in having lots of (>~10) umls
simultaneously (he found another one today and knows how to fix it -
whoopee!).
Please feel free to get in touch with me if you'd like to talk on
the phone about it. I'd be glad to answer your questions and help out
with the planning.
Cheers,
- Bill
---------------------------------------------------------------------------
"Ironically, DeCSS was published on the Web by a U.S. court (as
evidence) as a result of legal action against people who posted DeCSS on
the Web. Oops."
-- Sandy McMurray, readme@...
http://canoe.ca/TechNews/column_readme.html
--------------------------------------------------------------------------
William Stearns (wstearns@...). Mason, Buildkernel, named2hosts,
and ipfwadm2ipchains are at: http://www.stearns.org
--------------------------------------------------------------------------

On Sat, Sep 21, 2002 at 01:54:43AM -0400, William Stearns wrote:
> If I were in your shoes, I'd look at buying 3 dual cpu/4G machines
> with striped (raid 0) storage; back them up to each other. Put 10 client
> machines on a lan with each UML server.
At that point, why not just buy each person a low-end Linux box of
their own?
Seriously. The box you are describing is insanely expensive. You're
talking about $3k, maybe more, if I had to guess. For the cost of
each one of those, you could easily build or buy 6 "low-end" (likely
>1GHz Celeron or Duron) systems with 128MB RAM and say a 40GB disk.
(I personally sell around 1GHz Celerons like that with no OS for
$450. Anyone could build the same box by hitting pricewatch.com for
$350 or so.) Load those boxes down with around 1GB RAM, and you could
"cluster" them, each running 4-6 UML sessions, avoiding the need to
actually put a box at each desk.
So, seriously, what do you gain by running all that on one big monster
box instead of lots of little cheap boxes?
For that matter, if you already have Windows boxes that people can
use, wouldn't it make as much sense to just dual-boot them, or maybe
add removable hard drive bays and swap the Windows drive for a Linux
drive occasionally? Granted, that doesn't give you 100% uptime, but
still...
Don't get me wrong. I'm (hopefully obviously) a big fan of UML, but
everything has its place...
I suppose a lot of it depends on the real workload they expect, but if
that workload is going to be 30 X users at the same time, then $10k or
so spent on 30 individual systems would be my choice...
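The comparison can be put in numbers with a short Python sketch. All prices are the guesses given above ($3k per big server, $450 or $350 per low-end box), so the result is only as good as those guesses.

```python
def boxes_for_budget(budget, unit_price):
    """How many whole low-end boxes fit in a given budget."""
    return budget // unit_price


def fleet_cost(units, unit_price):
    """Total price of a number of identical machines."""
    return units * unit_price


BIG_SERVER = 3000   # guessed price of one dual cpu/4G server
RETAIL_BOX = 450    # ready-built ~1GHz Celeron, no OS
SELF_BUILT = 350    # same box built from parts

print(boxes_for_budget(BIG_SERVER, RETAIL_BOX), "retail boxes per big server")
print(fleet_cost(3, BIG_SERVER), "dollars for three big servers")
print(fleet_cost(30, SELF_BUILT), "dollars for 30 self-built boxes")
```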
Steve
--
steve@... | Southern Illinois Linux Users Group
(618)398-7360 | See web site for meeting details.
Steven Pritchard | http://www.silug.org/

On Saturday 21 September 2002 08:21, Steven Pritchard wrote:
> At that point, why not just buy each person a low-end Linux box of
> their own?
Because that would give them root rights and the ability to access the
Windows-fs on that dual-boot box. The institute isn't thrilled about that
(computer science students have a way of breaking things). They need to do
network-stuff and even administrating a system, so they can't go without root
rights.
Or... (thinking as I type)... we could make the machines all dual-boot, don't
give them root rights, but have UML installed, so they can have their own
environment... Wow. Why didn't I think about this before? No need to buy
hardware at all! Just reinstall the stuff...
> So, seriously, what do you gain by running all that on one big monster
> box instead of lots of little cheap boxes?
Well, actually, my study-association was hoping it would gain the rights to
one virtual server on it... When classes are running, it would be slow... but
when it's the only UML-process........ I rest my case.
But gain for the study-association isn't my main goal, so I'll really
consider this idea.
> I suppose a lot of it depends on the real workload they expect, but if
> that workload is going to be 30 X users at the same time, then $10k or
> so spent on 30 individual systems would be my choice...
And I do think you have a very valid point. I'll take this into my
considerations. Probably "Chapter 1 - Use existing infrastructure"-stuff.
Thanks for the input!
--
Regards,
Tim Stoop
PGP public key: finger cvd@...
Random quote/fortune:
A fool's brain digests philosophy into folly, science into superstition, and
art into pedantry. Hence University education. -- G. B. Shaw

On Saturday 21 September 2002 07:54, William Stearns wrote:
> On Sat, 21 Sep 2002, Tim Stoop wrote:
> > Finally, we got my university interested in doing a Linux practicum. I
> > showed
>
> Fantastic!
Yeah it is! It took me one year of lobbying, but finally, they're listening.
And trust me, lobbying as a student is very difficult :(
> > Cygwin-X-server on Windows Desktops. So now they need a central server
> > that people can use for their work. Work will consist of: Installing the
> > stuff, configuring it, network-setup and -maintenance... Not really
> > difficult stuff.
>
> But somewhat time consuming. :-)
Oh, that doesn't matter. I explained it incorrectly: they're going to be
used as a study-environment. They will learn how to install packages (and
maybe a complete system), how to configure it, etc. And all in a safe
environment, without giving away root-rights. That last thing (root-rights)
was a major show-stopper. I considered the SELinux-kernel, but in the end I
think UML is far better. Gives more options, too. (My theory: You learn best
from mistakes.)
> This really comes down to the question raised by another poster;
> are you looking to have an entire gnome/kde session with all the applets,
> widgets, and apps all running on the uml server, with the client
> responsible for the X display, and having a classroom of 30 people doing
> this simultaneously?
Yup. But only one connection per UML-server, though. And probably not all
widgets and applets available. They need to learn how to use it comfortably.
If they want to (really) create a nice desktop with all kinds of widgets, my
study-association sells cheap Mandrake/RedHat/Debian-CD's :)
> Please don't take offense when I suggest the above isn't likely to
> work. If you're truly running 30 live desktops with multiple
> applications, each one is going to get 1/30 of 4GHz, about 133MHz, some of
> which will be taken up by UML and host overhead, leaving an effective CPU
> of 60-90MHz. Nobody's going to be happy with that for X apps.
Aargh, indeed. I hadn't made this calculation yet :( So dual fails, let's
skip to quad. Quad Xeon, then, because quad AMD is still in its juvenile
state (I heard).
> Also, the 30 simultaneous heavy loads are going to be killed by
> memory long before CPU. Desktops like that like lots of memory, or
> they'll be swapping all the time. 5 or 10 people with moderate swap loads
> are going to kill any disk bandwidth you need for loading apps and data.
Indeed. So I'd better go for 128MB per session. I never tried it, but does
Linux work without swap? For this scenario?
At first, I just want them to learn how to setup networking and Apache and
all. The GUI is just to make them feel a little more comfortable (people run
at the sight of CLI around these parts, even at an Informatics-university, I'm
afraid).
> > Things I thought out myself:
> > - Better have about 2GB of RAM, so each user can be assigned 64MB.
>
> More. More. More. :-)
> 4G as a starting point. More if you can get it. If you can't
> afford it, slow down the CPU - I'm quite serious - and put the money
> you save to more ram.
Hm. The thing is, I don't know the budget :) I'm trying to make a machine as
light as possible, so they'll finally start a pilot. If they like it, I'm
going to recommend more RAM and more machines for the same stuff. Maybe in
time, if UML is ported, a quad UltraSPARC III :) I'm told that would give me
lots of possibilities :)
I'm just a student trying to convince management that this is viable and a
good idea, too. People shouldn't be using only M$-stuff at a computer science
education...
> I went with 4x 120G IDE, raid 5'd for 360G usable.
RAID is a good option indeed...
> - Raid 1 - mirroring
> All data is written to both drives in the pair (more than two
> drives? set them up as N/2 pairs). For N physical drives, you get N/2 in
What do you mean by setting them up as N/2 pairs? Raid 1 within Raid 1?
> Also, you'll need to think about software vs. hardware raid. I
> paid about $200 for a 3ware raid controller that will do the major raid
> flavors. I went with raid 5 to protect my data. Now my CPU's are free to
> work on user apps instead of calculating raid checksums.
I asked around in my study-association (an oasis for people who don't want
to learn other OSes than M$, the only room in the entire building with a
Linux/BSD/Mac-network) and they said: "Software raid isn't raid." :) So it'll
be hw-raid.
> The web site describing the project is
> http://www.stearns.org/slartibartfast/ . Take a look at
> http://www.stearns.org/slartibartfast/uml-coop.current.html ; the
> document's not done yet. I'm up to 24 virtual machines on the shared box.
I'll look at it monday. Sounds very interesting.
> I would suggest that you could probably feed up, say, 6-8 X
> desktop users on a dual 2ghz athlon machine with 2G or more of ram. Make
> that a quad CPU w/ 4G ram and you can probably handle 16-20.
Hmm... I'll take that into my report.
> If I were in your shoes, I'd look at buying 3 dual cpu/4G machines
> with striped (raid 0) storage; back them up to each other. Put 10 client
> machines on a lan with each UML server.
I think for a pilot, 30 people is just a bit much. I think I'll tone it
down to 10, on one dual cpu/2G machine. Probably with raid 0 or raid 1.
> You might want to consider a
> network setup with a gigabit nic in the uml server talking to a gigabit
> port on a switch (with the clients talking to the switch at 100mbit) so
> the clients aren't fighting over 100mbit for lots of graphical data.
> Although you might get away with just a straight 100mbit switch, not sure.
I'll probably try without gigabit first. But thanks for the suggestions!
> One last note. Keep in mind that Jeff and the other kernel
> developers have done a great job, but there are still quirks to be worked
> through when you get into a project like this.
Yeah, but I've been monitoring the list for a few weeks now and the program
since it was on LinuxToday the first time and I've seen the speed at which
issues are being handled. The team is doing a great job. By the time my
project has three servers, all these quirks will be gone ;-) Holland is a
bureaucracy, even at University :(
> Please feel free to get in touch with me if you'd like to talk on
> the phone about it. I'd be glad to answer your questions and help out
> with the planning.
Thanks for the offer, I'll keep that in mind. And thanks for the input, this
is great help!
--
Regards,
Tim Stoop
PGP public key: finger cvd@...
Random quote/fortune:
Our missions are peaceful -- not for conquest. When we do battle, it is only
because we have no choice. -- Kirk, "The Squire of Gothos", stardate 2124.5