From eugen at leitl.org Wed Mar 2 05:51:11 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Wed, 2 Mar 2005 11:51:11 +0100
Subject: [Beowulf] Re: [Bioclusters] error while running mpiblast (fwd from
landman@scalableinformatics.com)
Message-ID: <20050302105110.GC13336@leitl.org>
----- Forwarded message from Joe Landman -----
From: Joe Landman
Date: Wed, 02 Mar 2005 00:25:05 -0500
To: "Clustering, compute farming & distributed computing in life science informatics"
Cc:
Subject: Re: [Bioclusters] error while running mpiblast
User-Agent: Mozilla Thunderbird 1.0 (X11/20041207)
Reply-To: "Clustering, compute farming & distributed computing in life science informatics"
James Cuff wrote:
>"Iam running this on SGI multiprocessor(numa)",
>
>you are running on a single shared (well near unified, and SGI do this
>very, very well) memory server with, as you said and appear to understand
>shared storage...
>
>*sigh*
>
>What on earth are you going to gain from MPI? Standard NCBI threads
>should do for you just fine, or maybe I've been smoking the funny stuff.
Hi James:
It is quite possible that mpiblast will scale better than NCBI BLAST
on this system. MPI forces you to pay attention to locality of
reference, so you tend to do a good job partitioning your code (that is,
if it scales). NCBI BLAST is built with pthreads, and I haven't seen it scale
much beyond about 10 CPUs on an SMP. The coarser grain of the mpiblast
partitioning (the pthread partitioning is very fine grained) will very
likely result in better scalability on a NUMA machine.
Not only that, but large multi-CPU NUMA machines have problems with memory
hotspots. I remember in the Origin days we used to play games with
dplace directives and whatnot to control memory layout, replication
of pages, etc. This was under IRIX, and there were rich sets of tools
to help. I don't think many of them are available under Linux right now
(possibly in the SGI ProPack). You don't see much of a problem in 2- or 4-way
systems. It becomes serious when you load data into a page and you
start getting 16 requestors for that page. Page migration is not a win
here; read-only page replication can be a huge win. Luckily, with
MPI, all references are local to begin with...
That said, I don't have ready access to one, so I cannot test this
hypothesis, though I might just throw together a BBS experiment to test
this. I'd love to play with a nice 9MB cache machine. This would be a
sweet blast engine :) Expensive... yes, but running out of cache is a
"good thing" (TM).
>If you _do_ happen to have multiple NUMA's in a cluster, (1) you are very
>lucky and (2) you should the still listen to Joe's advice... Local is
>only local so far, try:
>
> Shared=/home/kalyani/toolkit/ncbi
> Local=/tmp/kalyani_mpiblast/
>
>(or as Joe maybe put better)
>
> Shared=/home/kalyani/toolkit/ncbi
> Local=/mylocalfilesystemthatnoonewillmesswith/kalyani_mpiblast/
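(If memory serves, those Shared/Local settings go in an [mpiBLAST] stanza of
~/.ncbirc -- the section name here is from memory, so check the mpiblast docs
for your version -- roughly:

  [mpiBLAST]
  Shared=/home/kalyani/toolkit/ncbi
  Local=/tmp/kalyani_mpiblast/
)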
Lucas sent me a note indicating that in 1.3.0 they allow for shared and
local to coexist. Aaron/Lucas, if you are about, could you clarify some
of this? I don't want to lead people astray (and I will need to update
the SGE tool).
>
> WFM, YMMV..
Note: We have not built the mpiblast RPM for Itanium (nor, for that
matter, any of our other RPMs). Is there any interest in this? Curious.
Joe
>
>Best,
>
>J.
>
>--
>James Cuff, D. Phil.
>Group Leader, Applied Production Systems
>Broad Institute of MIT and Harvard. 320 Charles Street,
>Cambridge, MA. 02141. Tel: 617-252-1925 Fax: 617-258-0903
>
>
>
>
>_______________________________________________
>Bioclusters maillist - Bioclusters at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bioclusters
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters
----- End forwarded message -----
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Wed Mar 2 12:30:08 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Wed, 2 Mar 2005 18:30:08 +0100
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
Message-ID: <20050302173008.GY13336@leitl.org>
Speaking of InfiniBand, I presume there are still no motherboards with IB
ports onboard?
http://www.internetnews.com/dev-news/article.php/3485401
February 24, 2005
Linux Kernel 2.6.11 Supports InfiniBand
By Sean Michael Kerner
The Linux world is bracing for the final release of the new Linux 2.6.11
kernel, which will include a long list of driver updates and patches, with
InfiniBand support perhaps being one of the most interesting new additions.
Late last night, Linux creator Linus Torvalds issued the fifth release
candidate for the 2.6.11 kernel. The first 2.6.11 RC was issued on Jan. 12;
the second on Jan 21; the third on Feb. 2; and the fourth on Feb. 12.
In the RC5 posting, Torvalds indicated that it was likely the last RC before
the final release.
"Hey, I hoped -- rc4 was the last one, but we had some laptop resource
conflicts, various ppc TLB flush issues, some possible stack overflows in
networking and a number of other details warranting a quick -- rc5 before the
final 2.6.11," Torvalds wrote.
"This time it's really supposed to be a quickie, so people who can, please
check it out, and we'll make the real 2.6.11 asap."
The long list of updates in the 2.6.11 kernel includes architecture updates
for x86-64, ia64, ppc, arm and mips, as well as updates to ACPI, DRI
(Direct Rendering Infrastructure, which permits direct access to graphics
hardware for X Window System users), ALSA (Advanced Linux Sound Architecture,
which provides MIDI and audio functionality to Linux), SCSI and
the XFS high-performance journaling filesystem.
The 2.6.11 kernel will also be significant in that it includes driver support
for the InfiniBand interconnect architecture. InfiniBand, which is
derived from its underlying concept of "infinite bandwidth," is a switched
fabric interconnect technology for high-performance network devices that is
common in a number of supercomputer clusters.
The upcoming inclusion of InfiniBand support in the Linux kernel is a major
step according to the InfiniBand Trade Association.
"The inclusion of InfiniBand drivers in the upstream Linux kernel is a
significant milestone," Ross Schibler, CTO of InfiniBand vendor Topspin
Communications, told internetnews.com.
InfiniBand support was available previously in various Linux distributions,
but it wasn't part of the mainstream kernel.org Linux.
"This now means that anyone that downloads a kernel will have automatic
access to the software," explained Schibler. "It also means that any upcoming
distributions (Red Hat, SUSE, etc.) will have the software included on their
CDs. Previously SUSE had it on a distribution, but only in the 'unsupported'
directory."
Schibler sees the inclusion of InfiniBand as a testament to the maturation of
the technology.
"Now that the technology has matured to such a point that Linus has accepted
it into the kernel, the way is paved for greater distribution of the code and
accelerated deployment of the technology," Schibler said.
The previous Linux kernel.org release, version 2.6.10, was issued on Dec. 24
after two release candidates. Linux distributions began including 2.6.10
thereafter, with Red Hat's Fedora Project being one of the first.
Fedora Core 3 initially shipped with the 2.6.9 kernel and then upgraded to
the 2.6.10 kernel on Jan 13. Mandrakelinux's 10.2 Beta 3 also includes the
2.6.10 release. SUSE Linux 9.2 currently includes the 2.6.8 kernel.
Including the most recent kernel into a distribution is not a particularly
easy task. The upcoming Debian, code-named Sarge, will only ship with the
2.6.8 kernel. In a release update e-mail, Debian Sarge release manager
Andreas Barth related that a meeting was recently held to review the status
of which kernel they would include.
"The team leads involved eventually decided to stay with kernel 2.6.8 and
2.4.27, rather than bumping the 2.6 kernel to 2.6.10," Barth wrote. "This
decision was made upon review of the known bugs in each of the 2.6 kernel
versions; despite some significant bugs in the Debian 2.6.8 kernel tree,
these bugs were weighed against the additional delays that a kernel version
bump would introduce in the schedule for debian-installer RC3."
"As it happens, preparing 2.4 and 2.6 kernels with the security fixes for all
architectures took roughly two months from start to finish, during which time
preparation of the next debian-installer release candidate has been entirely
stalled," he added.
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gotero at linuxprophet.com Wed Mar 2 14:08:19 2005
From: gotero at linuxprophet.com (Glen Otero)
Date: Wed, 2 Mar 2005 11:08:19 -0800
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To: <20050302173008.GY13336@leitl.org>
References: <20050302173008.GY13336@leitl.org>
Message-ID: <5769e684325b11f3146f648fe06d327f@linuxprophet.com>
Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
Glen
On Mar 2, 2005, at 9:30 AM, Eugen Leitl wrote:
>
> Speaking of InfiniBand, I presume there are still no motherboards with
> IB ports onboard?
>
> ...snip...
>
Glen Otero Ph.D.
Linux Prophet
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From hahn at physics.mcmaster.ca Wed Mar 2 18:09:09 2005
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Wed, 2 Mar 2005 18:09:09 -0500 (EST)
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To: <5769e684325b11f3146f648fe06d327f@linuxprophet.com>
Message-ID:
> Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
given the choice between a $150 pcie IB nic and having it onboard,
I'd choose the separate card. I know the IB salesdroids always
say that getting onto the MB will change everything, but this
doesn't make sense. IB is completely different from onboard gigabit,
for instance, because there is no ubiquitous IB infrastructure
ready, waiting to be exploited.
the problem with "if you build it onboard, they will come" is also
the marginal cost. onboard gigabit is nearly the same cost as
onboard 100bT, very low, and you pretty much always want it.
onboard IB is noticeably higher than onboard GBE, noticeable in
absolute terms, and on many systems you have no use for it at all.
remember, most people don't even saturate GBE yet, and GBE
ports are damned cheap. GBE nics are free, and switch ports
are now down to $US 23/port:
http://froogle.google.com/froogle?q=netgear+GS748T&btnG=Search+Froogle
fundamentally, IB is still facing most of the same problems it always has:
- requires fairly expensive, unique infrastructure
- not the greatest physical layer: it's easy to wind up with
literally tons of IB cables.
- not clearly superior in performance vs alternatives.
- apparently designed by people who disliked existing technique
or were ignorant of it.
- not a drop-in replacement for alternatives.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gotero at linuxprophet.com Wed Mar 2 18:21:19 2005
From: gotero at linuxprophet.com (Glen Otero)
Date: Wed, 2 Mar 2005 15:21:19 -0800
Subject: Fwd: [Beowulf] 2.6.11 is out; with InfiniBand support
Message-ID: <0a773778f2243e80397520d276ba56b0@linuxprophet.com>
Begin forwarded message:
> From: Glen Otero
> Date: March 2, 2005 3:20:41 PM PST
> To: Bill Broadley
> Subject: Re: [Beowulf] 2.6.11 is out; with InfiniBand support
>
>
> On Mar 2, 2005, at 3:17 PM, Bill Broadley wrote:
>
>> On Wed, Mar 02, 2005 at 11:08:19AM -0800, Glen Otero wrote:
>>> Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
>>>
>>
>> Via pci-express?
>
> PCI-Express
>
>> Or via an HTX[1] slot?
>>
>> [1]
>> http://www.hypertransport.org/products/productdetail.cfm?RecordID=65
>>
>> When?
>
> Available now, according to Mellanox. I've seen pictures of the boards.
>>
>> --
>> Bill Broadley
>> Computational Science and Engineering
>> UC Davis
>>
>>
> Glen Otero Ph.D.
> Linux Prophet
>
>
Glen Otero Ph.D.
Linux Prophet
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From bill at cse.ucdavis.edu Wed Mar 2 18:17:53 2005
From: bill at cse.ucdavis.edu (Bill Broadley)
Date: Wed, 2 Mar 2005 15:17:53 -0800
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To: <5769e684325b11f3146f648fe06d327f@linuxprophet.com>
References: <20050302173008.GY13336@leitl.org>
<5769e684325b11f3146f648fe06d327f@linuxprophet.com>
Message-ID: <20050302231753.GA5857@cse.ucdavis.edu>
On Wed, Mar 02, 2005 at 11:08:19AM -0800, Glen Otero wrote:
> Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
>
Via pci-express? Or via an HTX[1] slot?
[1] http://www.hypertransport.org/products/productdetail.cfm?RecordID=65
When?
--
Bill Broadley
Computational Science and Engineering
UC Davis
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From maillists at gauckler.ch Tue Mar 1 01:44:14 2005
From: maillists at gauckler.ch (Michael Gauckler)
Date: Tue, 01 Mar 2005 07:44:14 +0100
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
Message-ID: <1109659454.6544.2.camel@localhost.localdomain>
Dear List,
I would like to gather the data from several processes.
Instead of the commonly used stride, I want to interleave
the data:
Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
Rank 1: BBBBB ----^---^---^---^---^
Rank 2: CCCCC -----^---^---^---^---^
Rank 3: DDDDD ------^---^---^---^---^
Since the stride of the receive type is indicated
in multiples of its mpi_type, no interleaving is
possible (the smallest striping factor leads to
AAAAABBBBBCCCCCDDDDD).
Is there a way to achieve this behaviour in an
elegant way, as MPI_Gather promises it? Or do
I need to do Send/Recv with self-aligned offsets?
Thank you for your help!
Michael
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jrajiv at hclinsys.com Tue Mar 1 06:07:53 2005
From: jrajiv at hclinsys.com (Rajiv)
Date: Tue, 1 Mar 2005 16:37:53 +0530
Subject: [Beowulf] GRID APPLICATION
Message-ID: <012e01c51e4e$ea4b6860$0f120897@PMORND>
Dear All,
I have setup Globus 3.2 on two machines and I am able to submit job from
one machine to another. I have a basic doubt about what application to run
in GRID environments. Shouldn't the GRID application use resources of both
the GRID machines simultaniously. Are there any applications like this. So
far I am only running remote jobs from on machine to another - for eg I can
submit
and run LINPACK/GROMACS job from one master of a cluster to a master of
another cluster.
Regards,
Rajiv
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jakob at unthought.net Tue Mar 1 10:51:34 2005
From: jakob at unthought.net (Jakob Oestergaard)
Date: Tue, 1 Mar 2005 16:51:34 +0100
Subject: [Beowulf] motherboards for diskless nodes
In-Reply-To: <1109415598.6688.21.camel@ip13.2214.h2.fosdem.lan>
References:
<1109319374.6055.17.camel@Vigor45>
<1109351864.2883.22.camel@localhost.localdomain>
<20050225183146.GA1563@greglaptop.internal.keyresearch.com>
<1109415598.6688.21.camel@ip13.2214.h2.fosdem.lan>
Message-ID: <20050301155134.GM347@unthought.net>
On Sat, Feb 26, 2005 at 10:59:57AM +0000, John Hearns wrote:
> On Fri, 2005-02-25 at 10:31 -0800, Greg Lindahl wrote:
>
> >
> > Doesn't make any sense; I have seen people describe such systems where
> > they download a disk image when a batch job wants a different software
> > load. It's certainly doable that way: it does have different tradeoffs
> > from the diskless case, but if it gives you a headache, it's probably
>
> I've always dreamed of using User Mode Linux images for this.
> In a Grid-based world, prepare a UML instance which has all the
> libraries and runtime to run your code. Ship it across the grid with
> your executable.
> The cluster at the receiving end can be running any distribution - it
> runs your UML in a sandbox.
Please see RFC 1925, corollary 6a:
It is always possible to add another level of
indirection.
Coming from truth 6:
It is easier to move a problem around (for example, by moving
the problem to a different part of the overall network
architecture) than it is to solve it.
Your UML is, as its name implies, a user-space application, just like
the real application you were actually trying to run. If your
application cannot be run on a given distro, I pretty much doubt your
UML (which is a very very complex user mode application) will run.
What you want is KISS: Keep It Simple (Stupid)
Don't link to a gazillion libraries if you don't have to. Link the
libraries statically when feasible (gives you a performance gain in many
cases anyway).
A statically linked application, or one with only glibc linked
dynamically, will run on very wide ranges of distributions.
Trust me on this; I make a living from selling an evil capitalistic
closed-source solution which needs to run on a very wide range of
distributions (and no, we do not link glibc statically because we're not
allowed to, but we keep our dependencies minimal and our binaries do run
on a very wide range of distributions).
>
> And before anyone says it, yes performance would be a dog,
> and I don't see how UML could access all those nice Myrinet and
> Infiniband cards. SO I'm definitely blue-skying.
Again; adding layers of indirection is rarely a solution.
--
/ jakob
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From peter at cs.usfca.edu Tue Mar 1 12:19:41 2005
From: peter at cs.usfca.edu (Peter Pacheco)
Date: Tue, 1 Mar 2005 09:19:41 -0800
Subject: [Beowulf] Re: Pi calculator
In-Reply-To: <42231589.4080706@scalableinformatics.com>
References: <42231589.4080706@scalableinformatics.com>
Message-ID: <20050301171941.GA5545@cs.usfca.edu>
On Mon, Feb 28, 2005 at 07:58:49AM -0500, Joe Landman wrote:
>
> >>2. Does anybody know of a program that will calculate pi, one digit at a
> >>time, infinitely that will run in parallel?
> >
> >
> >I don't know about one that will compute an infinite number of digits in
> >PI, but the computation of PI via the arctan series is trivially
> >partitionable in a variety of ways. You'll spend more time working to
> >sum and align the digits you get (as they obviously will have to be
> >obtained and manipulated piecewise as strings) than you will doing the
> >computation per se. It actually sounds like a decent exercise, as the
> >carry from small digits may have to propagate iteratively back to larger
> >ones as you extend the computation farther and farther.
> >
>
>
> http://mathworld.wolfram.com/PiDigits.html
> http://mathworld.wolfram.com/PiFormulas.html
> http://www.andrews.edu/~calkins/physics/Miracle.pdf
>
> and others.
>
> It is possible to calculate the digits individually using the Bailey et
> al algorithm.
>
> Joe
I wrote a short MPI program last summer that uses
the Bailey-Borwein-Plouffe algorithm and the GMP library
(http://www.swox.com/gmp/) to compute arbitrarily many digits of pi.
Jake, send me email (peter at cs.usfca.edu) if you want a copy.
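For anyone who just wants to see the formula in action, here is a tiny,
untested double-precision illustration of the Bailey-Borwein-Plouffe series --
not Peter's GMP/MPI program, and without the hex digit-extraction trick that
makes BBP interesting:

  #include <stdio.h>

  int main(void)
  {
      double pi = 0.0, p16 = 1.0;           /* p16 tracks 16^-k             */
      for (int k = 0; k < 15; k++) {        /* a dozen or so terms is more  */
          pi += p16 * (4.0 / (8 * k + 1)    /* than enough for a double     */
                     - 2.0 / (8 * k + 4)
                     - 1.0 / (8 * k + 5)
                     - 1.0 / (8 * k + 6));
          p16 /= 16.0;
      }
      printf("%.15f\n", pi);
      return 0;
  }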
Best wishes,
Peter Pacheco
Department of Computer Science
University of San Francisco
San Francisco, CA 94117
(415) 422-6630
peter at cs.usfca.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From steve_heaton at ozemail.com.au Tue Mar 1 19:21:04 2005
From: steve_heaton at ozemail.com.au (steve_heaton at ozemail.com.au)
Date: Wed, 2 Mar 2005 11:21:04 +1100
Subject: [Beowulf] So we will write our own book - next steps...
Message-ID: <20050302002104.EBYF8920.swebmail00.mail.ozemail.net@localhost>
G'day all
I humbly submit my A$0.02 as a novice Beowulf'er.
I don't have a problem with a series of collected articles. I agree it's a great way to keep the journal/book fresh. The personal styles of the authors don't present a challenge for me if the content is good quality. This list is a great example!
I'd like to see "something in front of the punters" rather than aiming for perfection with little output as a result. Especially as we're initially looking at a soft format.
Articles need not be long and involved. Some of the gems I've got from this list are 25 words or less ;) Another advantage of the article approach.
Once we get to "things", my suggestion for an outline (per topic) is:
-) What is it? (with a bit of background/history etc)
-) How does it work? (Roughly)
-) How do I install/use it?
-) Tricks and tips (solutions to common problems)
-) Where to find more info (net refs, books etc)
Your basic FAQ thang :)
Vendors are in but the editors wield a heavy hand on 'barrow pushing. The vendors on this list seem good on the education and rarely get pulled into a p'ing contest. Something rare and beautiful compared to other lists! They know their kit, I'd like their knowledge and experience. No doubt they'll be flooded with sales as a result ;)
Ability to download journal/book for offline reading is critical.
Editors are neutral moderators. E.g. they don't side on local HD vs net boot but will present (all) options without fear or favour. The goal is to leave the reader in a position to make an informed decision :)
I have every intention to contribute. The words are vapour until I do! =)
Cheers
Stevo
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ddw at dreamscape.com Wed Mar 2 00:02:48 2005
From: ddw at dreamscape.com (Dan Williams)
Date: Wed, 02 Mar 2005 00:02:48 -0500
Subject: [Beowulf] Re: So we will write our own book - next steps...
In-Reply-To: <200503012001.j21K0Yik026268@bluewest.scyld.com>
References: <200503012001.j21K0Yik026268@bluewest.scyld.com>
Message-ID: <422548F8.7010603@dreamscape.com>
The question has been raised as to addressing the needs of beginners, as well
as advanced people. I am about as beginner as you can get. I have never built
or used a cluster, and am a Linux newbie, besides. If a rank beginner chapter
is desired, I volunteer to write it, if someone can hold my hand while I turn
a pair of Pentium 100MHz motherboards and miscellaneous parts I have in my
junk pile into a working (2 node) cluster. I am pretty good at writing
non-fiction if it's a subject I know or can learn about, but as of now, I only
have the vaguest notions on how to make a functioning cluster. If there is
interest in including a chapter that is detailed enough but basic enough that
someone who knows nothing on the subject can learn enough to actually build a
functioning cluster from junk parts, then I'm your writer. I'll build a
"proof of concept" junkyard cluster and write about it, if someone can help me
figure out how.
DDW
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eno at dorsai.org Wed Mar 2 04:00:43 2005
From: eno at dorsai.org (Alpay Kasal)
Date: Wed, 02 Mar 2005 04:00:43 -0500
Subject: [Beowulf] where can i learn to build a cluster machine?
In-Reply-To:
Message-ID: <0ICP00HHNVJ1JO@mta2.srv.hcvlny.cv.net>
Holy Cow... This message is a keeper. Thanks a million Robert.
Alpay Kasal
-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On
Behalf Of Robert G. Brown
Sent: Friday, February 25, 2005 12:54 PM
To: Starship Warrior
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] where can i learn to build a cluster machine?
...snip...
To give you the direct answer, it goes something like the following:
a) Hook systems into a common switched LAN e.g. an ethernet switch.
b) If possible use decent quality PXE-aware NICs
c) If possible use nodes with a decent amount of installed memory (>=
192 MB) although it is possible to get by with less, with effort.
d) Node hard disk is optional for at least some installation methods
(e.g. warewulf) but is useful and enables others.
e) At least one system NEEDS ample hard disk and will serve as a
"server" or "head node" to your cluster. This node will manage boot
images, the distro you wish to install, NFS or other shared filesystems,
authentication, and gives you a place to "login to the cluster". Note
that this is a sloppy requirement -- there are many different ways to
manage this and I'm just describing one of the simplest and most
straightforward ones.
...snip...
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From atp at piskorski.com Wed Mar 2 21:06:00 2005
From: atp at piskorski.com (Andrew Piskorski)
Date: Wed, 2 Mar 2005 21:06:00 -0500
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To:
References:
Message-ID: <20050303020600.GA56437@piskorski.com>
On Wed, Mar 02, 2005 at 06:09:09PM -0500, Mark Hahn wrote:
> > Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
>
> given the choice between a $150 pcie IB nic and having it onboard,
> I'd choose the separate card. I know the IB salesdroids always
Except, a single PCI-X Infiniband card currently costs $1000 or so,
right? (That's for a 4x 2 port card, but Froogle does not seem to
know of any cheaper cards.)
http://h30094.www3.hp.com/product.asp?sku=2603660&jumpid=ex_r2910_frooglesmb/accessories
http://www.costcentral.com/proddetail/HP_NC570C/376158B21/F35425/froogle/
--
Andrew Piskorski
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rene at renestorm.de Wed Mar 2 22:05:54 2005
From: rene at renestorm.de (rene)
Date: Thu, 3 Mar 2005 04:05:54 +0100
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <1109659454.6544.2.camel@localhost.localdomain>
References: <1109659454.6544.2.camel@localhost.localdomain>
Message-ID: <200503030405.54791.rene@renestorm.de>
Hi Michael,
in my opinion, this would be a gather with length 1 but sent 4 times.
This seems to be the easiest and slowest way.
If I'm not totally wrong, your interleaving looks like an Alltoall followed by a
reduce operation, but why don't you sort the recv buffer afterwards?
Cu
Rene
> Dear List,
>
> I would like to gather the data from several processes.
> Instead of the comonly used stride, I want to interleave
> the data:
>
> Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
> Rank 1: BBBBB ----^---^---^---^---^
> Rank 2: CCCCC -----^---^---^---^---^
> Rank 3: DDDDD ------^---^---^---^---^
>
> Since the stride of the receive type is indicated
> in multpiles of its mpi_type, no interleaving is
> possible (the smallest striping factor leads to
> AAAAABBBBBBCCCCCDDDDD).
>
> Is there a way to achieve this behaviour in an
> elegant way, as MPI_Gather promises it? Or do
> I need to do Send/Recv with self-aligned offsets?
>
> Thank you for your help!
>
> Michael
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From felix.rauch.valenti at gmail.com Wed Mar 2 20:00:49 2005
From: felix.rauch.valenti at gmail.com (Felix Rauch Valenti)
Date: Thu, 3 Mar 2005 12:00:49 +1100
Subject: [Beowulf] Re: Pi calculator
In-Reply-To: <20050301171941.GA5545@cs.usfca.edu>
References: <42231589.4080706@scalableinformatics.com>
<20050301171941.GA5545@cs.usfca.edu>
Message-ID: <4eafc81b050302170064486203@mail.gmail.com>
On Tue, 1 Mar 2005 09:19:41 -0800, Peter Pacheco wrote:
> I wrote a short MPI program last summer that uses
> the Bailey-Borwein-Plouffe algorithm and the GMP library
> (http://www.swox.com/gmp/) to compute arbitrarily many digits of pi.
> Jake, send me email (peter at cs.usfca.edu) if you want a copy.
If somebody really wants to spend zillions of cycles on calculating Pi
just for fun, you could also look for non-random patterns in Pi on the
way. Maybe you will become famous one day.
(insert reference to Carl Sagan's "Contact" here)
- Felix
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From joachim at ccrl-nece.de Thu Mar 3 04:14:40 2005
From: joachim at ccrl-nece.de (Joachim Worringen)
Date: Thu, 03 Mar 2005 10:14:40 +0100
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <1109659454.6544.2.camel@localhost.localdomain>
References: <1109659454.6544.2.camel@localhost.localdomain>
Message-ID: <4226D580.2010206@ccrl-nece.de>
Michael Gauckler wrote:
> Dear List,
>
> I would like to gather the data from several processes.
> Instead of the comonly used stride, I want to interleave
> the data:
>
> Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
> Rank 1: BBBBB ----^---^---^---^---^
> Rank 2: CCCCC -----^---^---^---^---^
> Rank 3: DDDDD ------^---^---^---^---^
>
> Since the stride of the receive type is indicated
> in multpiles of its mpi_type, no interleaving is
> possible (the smallest striping factor leads to
> AAAAABBBBBBCCCCCDDDDD).
>
> Is there a way to achieve this behaviour in an
> elegant way, as MPI_Gather promises it? Or do
> I need to do Send/Recv with self-aligned offsets?
Actually, I don't see an 'elegant' way to do this, either. The decision
between multiple MPI_Gatherv() calls and an Irecv/Send/Waitall construct
depends on the quality of the MPI implementation you use (MPI_Gatherv
can be optimized well for small amounts of data), the characteristics of
your interconnect (high latency gives more room for optimization) and the
number of processes you use. For small process numbers, you won't see
much of a difference anyway.
You could also try to gather all data on the root in separate buffers,
and then let this process send/recv to itself using the proper datatypes.
Finally, if this communication is not a significant part of your
runtime, you shouldn't spend much time optimizing it anyway.
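(Untested sketch of the Irecv/Send/Waitall variant mentioned above, assuming N
elements per rank; the root collects each rank's contiguous block and then
interleaves locally afterwards:

  #include <mpi.h>
  #include <stdlib.h>
  #include <string.h>

  #define N 5

  int main(int argc, char **argv)
  {
      int rank, P;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &P);

      double mine[N];
      for (int i = 0; i < N; i++)
          mine[i] = rank;                            /* rank 0: A's, rank 1: B's, ... */

      if (rank == 0) {
          double *scratch  = malloc((size_t)N * P * sizeof(double));
          double *out      = malloc((size_t)N * P * sizeof(double));
          MPI_Request *req = malloc((size_t)(P - 1) * sizeof(MPI_Request));

          /* one outstanding receive per remote rank, into that rank's block */
          for (int i = 1; i < P; i++)
              MPI_Irecv(&scratch[i * N], N, MPI_DOUBLE, i, 0,
                        MPI_COMM_WORLD, &req[i - 1]);
          memcpy(scratch, mine, N * sizeof(double)); /* root's own block     */
          MPI_Waitall(P - 1, req, MPI_STATUSES_IGNORE);

          for (int i = 0; i < P; i++)                /* interleave: ABCDABCD */
              for (int j = 0; j < N; j++)
                  out[j * P + i] = scratch[i * N + j];

          free(req); free(scratch); free(out);
      } else {
          MPI_Send(mine, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
      }
      MPI_Finalize();
      return 0;
  }
)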
Joachim
--
Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 3 07:54:46 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 3 Mar 2005 07:54:46 -0500 (EST)
Subject: [Beowulf] Re: Pi calculator
In-Reply-To: <4eafc81b050302170064486203@mail.gmail.com>
References: <42231589.4080706@scalableinformatics.com>
<20050301171941.GA5545@cs.usfca.edu>
<4eafc81b050302170064486203@mail.gmail.com>
Message-ID:
On Thu, 3 Mar 2005, Felix Rauch Valenti wrote:
> On Tue, 1 Mar 2005 09:19:41 -0800, Peter Pacheco wrote:
> > I wrote a short MPI program last summer that uses
> > the Bailey-Borwein-Plouffe algorithm and the GMP library
> > (http://www.swox.com/gmp/) to compute arbitrarily many digits of pi.
> > Jake, send me email (peter at cs.usfca.edu) if you want a copy.
>
> If somebody really wants to spend zillions of cycles on calculating Pi
> just for fun, you could also look for non-random patterns in Pi on the
> way. Maybe you will become famous one day.
> (insert reference to Carl Sagan's "Contact" here)
Just be sure that you look with a powerful statistical tool --
remembering those damnable typing monkeys. Pi is well known to have all
sorts of non-random-looking patterns in it. Distributed (as far as
all studies done to date that I found referenced on the web) completely
randomly...;-)
Wait! I see a cloud that looks like the Virgin Mary! Gotta go and
write the Enquirer...:-)
rgb
(Still haven't taken my medicine this morning, and Deadline hisself is
already pre-emptively hassling me for the column I haven't written yet
for May:-)
(I just HAVE to quit playing WoW until 2 am before cumulative sleep
deprivation slays me like a dragon did last night.)
(Hmmm, combine business with pleasure? Maybe I'll try to contact the
WoW folks and get some detail about their realm cluster. That would
make a nifty article for June...)
(Damn, my interior monologue isn't working this morning. Must
sleep...:-)
>
> - Felix
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 3 08:20:57 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 3 Mar 2005 08:20:57 -0500 (EST)
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <4226D580.2010206@ccrl-nece.de>
References: <1109659454.6544.2.camel@localhost.localdomain>
<4226D580.2010206@ccrl-nece.de>
Message-ID:
On Thu, 3 Mar 2005, Joachim Worringen wrote:
> Michael Gauckler wrote:
> > Dear List,
> >
> > I would like to gather the data from several processes.
> > Instead of the comonly used stride, I want to interleave
> > the data:
> >
> > Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
> > Rank 1: BBBBB ----^---^---^---^---^
> > Rank 2: CCCCC -----^---^---^---^---^
> > Rank 3: DDDDD ------^---^---^---^---^
> >
> > Since the stride of the receive type is indicated
> > in multpiles of its mpi_type, no interleaving is
> > possible (the smallest striping factor leads to
> > AAAAABBBBBBCCCCCDDDDD).
> >
> > Is there a way to achieve this behaviour in an
> > elegant way, as MPI_Gather promises it? Or do
> > I need to do Send/Recv with self-aligned offsets?
What about RMA-like commands? MPI_Get in a loop? Since that is
controlled by the gatherer, one would presume that it preserves call
order (although it is non-blocking).
Or of course there are always raw sockets... where you have complete
control. Depending on how critical it is that you preserve this strict
interleaving order.
rgb
>
> Actually, I don't see an 'elegant' way to do this, either. The decision
> between multiple MPI_Gatherv() calls and a Irecv/Send/Waitall construct
> depends on the quality of the MPI implementation you use (MPI_Gatherv
> can be optimized well for small amounts of data), the characteristics of
> you interconnect (high latency gives more room for optimization) and the
> number of processes you use. For small process numbers, you wont see
> much of a difference anyway.
>
> You could also try to gather all data on the root in separate buffers,
> and then let this process send/recv to itself using the proper datatypes.
>
> Finally, if this communication is not a significant part of your
> runtime, you shouldn't spend much time optimizing it anyway.
>
> Joachim
>
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Thu Mar 3 08:42:20 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Thu, 3 Mar 2005 14:42:20 +0100
Subject: [Beowulf] purchasing sources for Newisys 2100 (and 4300)?
Message-ID: <20050303134219.GB13336@leitl.org>
Question to resident hardware purchasers: where do you get your Newisys systems,
in the EU (Germany, especially)? Small quantities.
The company I work for has good prices for Sun V20z iron, but naturally I'm
looking for better deals, especially with large memory configurations.
I know this is the wrong place to ask, but I can't find a lead on the web.
Thanks,
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gropp at mcs.anl.gov Thu Mar 3 09:23:10 2005
From: gropp at mcs.anl.gov (William Gropp)
Date: Thu, 03 Mar 2005 08:23:10 -0600
Subject: [Beowulf] MPI programming question: Interleaved
MPI_Gatherv?
In-Reply-To: <1109659454.6544.2.camel@localhost.localdomain>
References: <1109659454.6544.2.camel@localhost.localdomain>
Message-ID: <6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
At 12:44 AM 3/1/2005, Michael Gauckler wrote:
>Dear List,
>
>I would like to gather the data from several processes.
>Instead of the comonly used stride, I want to interleave
>the data:
>
>Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
>Rank 1: BBBBB ----^---^---^---^---^
>Rank 2: CCCCC -----^---^---^---^---^
>Rank 3: DDDDD ------^---^---^---^---^
>
>Since the stride of the receive type is indicated
>in multpiles of its mpi_type, no interleaving is
>possible (the smallest striping factor leads to
>AAAAABBBBBBCCCCCDDDDD).
>
>Is there a way to achieve this behaviour in an
>elegant way, as MPI_Gather promises it? Or do
>I need to do Send/Recv with self-aligned offsets?
You should be able to do this with MPI_Gather by creating a new datatype on
the receiving process whose extent is the size of a single item; that will
get you the correct offset for the first element. In order to receive the
subsequent elements into the desired location, you need to use a vector
type containing the number of elements. And for this to be fast, you need
an MPI implementation that will handle the "resized" datatype efficiently
(use MPI_Type_vector to create the full datatype and
MPI_Type_create_resized to change its effective extent). If you are moving
large amounts of data, separate send/recvs are probably a better choice.
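(For concreteness, an untested sketch of this resized-vector approach,
assuming P ranks that each contribute N=5 doubles and that rank 0 gathers:

  #include <mpi.h>
  #include <stdlib.h>

  #define N 5                               /* elements contributed per rank */

  int main(int argc, char **argv)
  {
      int rank, P;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &P);

      double sendbuf[N];
      for (int i = 0; i < N; i++)
          sendbuf[i] = rank;                /* rank 0 holds A's, rank 1 B's, ... */

      double *recvbuf = NULL;
      MPI_Datatype vec = MPI_DATATYPE_NULL, interleaved = MPI_DATATYPE_NULL;

      if (rank == 0) {
          recvbuf = malloc((size_t)N * P * sizeof(double));
          /* N blocks of 1 element, stride P elements apart ...               */
          MPI_Type_vector(N, 1, P, MPI_DOUBLE, &vec);
          /* ... then shrink the extent to one element so successive ranks
             start one slot further along in recvbuf.                         */
          MPI_Type_create_resized(vec, 0, sizeof(double), &interleaved);
          MPI_Type_commit(&interleaved);
      }

      /* root receives one resized type per rank: ABCDABCD... in recvbuf      */
      MPI_Gather(sendbuf, N, MPI_DOUBLE,
                 recvbuf, 1, interleaved, 0, MPI_COMM_WORLD);

      if (rank == 0) {
          MPI_Type_free(&interleaved);
          MPI_Type_free(&vec);
          free(recvbuf);
      }
      MPI_Finalize();
      return 0;
  }

The receive count of 1 per rank combined with the one-element extent is what
places rank i's first item at offset i; the vector stride of P elements then
spaces out the rest of its items.)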
Bill
>Thank you for your help!
>
> Michael
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
>http://www.beowulf.org/mailman/listinfo/beowulf
William Gropp
http://www.mcs.anl.gov/~gropp
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rross at mcs.anl.gov Thu Mar 3 11:35:37 2005
From: rross at mcs.anl.gov (Rob Ross)
Date: Thu, 03 Mar 2005 10:35:37 -0600
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To:
References: <1109659454.6544.2.camel@localhost.localdomain> <4226D580.2010206@ccrl-nece.de>
Message-ID: <42273CD9.5030503@mcs.anl.gov>
Robert G. Brown wrote:
> On Thu, 3 Mar 2005, Joachim Worringen wrote:
>
>>Michael Gauckler wrote:
>>
>>>I would like to gather the data from several processes.
>>>Instead of the comonly used stride, I want to interleave
>>>the data:
>>>
>>>Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
>>>Rank 1: BBBBB ----^---^---^---^---^
>>>Rank 2: CCCCC -----^---^---^---^---^
>>>Rank 3: DDDDD ------^---^---^---^---^
>>>
>>>Since the stride of the receive type is indicated
>>>in multpiles of its mpi_type, no interleaving is
>>>possible (the smallest striping factor leads to
>>>AAAAABBBBBBCCCCCDDDDD).
>>>
>>>Is there a way to achieve this behaviour in an
>>>elegant way, as MPI_Gather promises it? Or do
>>>I need to do Send/Recv with self-aligned offsets?
>
>
> What about RMA-like commands? MPI_Get in a loop? Since that is
> controlled by the gatherer, one would presume that it preserves call
> order (although it is non-blocking).
I would hope that one would read the spec instead! MPI_Get()s don't
necessarily *do* anything until the corresponding synchronization call.
This allows the implementation to aggregate messages. Call order (of
the MPI_Get()s in an epoch) is ignored.
> Or of course there are always raw sockets... where you have complete
> control. Depending on how critical it is that you preserve this strict
> interleaving order.
>
> rgb
No you don't! You're just letting the kernel buffer things instead of
the MPI implementation. Plus, Michael's original concern was doing this
in an elegant way, not explicitly controlling the ordering.
Joachim had some good options for MPI.
Regards,
Rob
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rross at mcs.anl.gov Thu Mar 3 11:36:48 2005
From: rross at mcs.anl.gov (Rob Ross)
Date: Thu, 03 Mar 2005 10:36:48 -0600
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
References: <1109659454.6544.2.camel@localhost.localdomain>
<6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
Message-ID: <42273D20.2000702@mcs.anl.gov>
William Gropp wrote:
> At 12:44 AM 3/1/2005, Michael Gauckler wrote:
>
>> Dear List,
>>
>> I would like to gather the data from several processes.
>> Instead of the comonly used stride, I want to interleave
>> the data:
>>
>> Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
>> Rank 1: BBBBB ----^---^---^---^---^
>> Rank 2: CCCCC -----^---^---^---^---^
>> Rank 3: DDDDD ------^---^---^---^---^
>>
>> Since the stride of the receive type is indicated
>> in multpiles of its mpi_type, no interleaving is
>> possible (the smallest striping factor leads to
>> AAAAABBBBBBCCCCCDDDDD).
>>
>> Is there a way to achieve this behaviour in an
>> elegant way, as MPI_Gather promises it? Or do
>> I need to do Send/Recv with self-aligned offsets?
>
>
> You should be able to do this with MPI_Gather by creating a new datatype
> on the receiving process whose extent is the size of a single item; that
> will get you the correct offset for the first element. In order to
> receive the subsequent elements into the desired location, you need to
> use a vector type containing the number of elements. And for this to be
> fast, you need an MPI implementation that will handle the "resized"
> datatype efficiently (use MPI_Type_vector to create the full datatype
> and MPI_Type_create_resized to change its effective extent). If you are
> moving large amounts of data, separate send/recvs are probably a better
> choice.
>
> Bill
Nice!
Rob
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From joachim at ccrl-nece.de Thu Mar 3 11:47:23 2005
From: joachim at ccrl-nece.de (Joachim Worringen)
Date: Thu, 03 Mar 2005 17:47:23 +0100
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
References: <1109659454.6544.2.camel@localhost.localdomain>
<6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
Message-ID: <42273F9B.4040506@ccrl-nece.de>
William Gropp wrote:
> You should be able to do this with MPI_Gather by creating a new datatype
> on the receiving process whose extent is the size of a single item; that
> will get you the correct offset for the first element. In order to
> receive the subsequent elements into the desired location, you need to
> use a vector type containing the number of elements. And for this to be
> fast, you need an MPI implementation that will handle the "resized"
> datatype efficiently (use MPI_Type_vector to create the full datatype
> and MPI_Type_create_resized to change its effective extent). If you are
> moving large amounts of data, separate send/recvs are probably a better
> choice.
Oh yes, I forgot, twiddling with LB and UB. I never liked this, esp. as
an MPI implementor. Not especially 'elegant', but it should work. Good
conformance test, BTW.
Joachim
--
Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 3 13:35:08 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 3 Mar 2005 13:35:08 -0500 (EST)
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <42273CD9.5030503@mcs.anl.gov>
References: <1109659454.6544.2.camel@localhost.localdomain>
<4226D580.2010206@ccrl-nece.de>
<42273CD9.5030503@mcs.anl.gov>
Message-ID:
On Thu, 3 Mar 2005, Rob Ross wrote:
> > What about RMA-like commands? MPI_Get in a loop? Since that is
> > controlled by the gatherer, one would presume that it preserves call
> > order (although it is non-blocking).
>
> I would hope that one would read the spec instead! MPI_Get()s don't
> necessarily *do* anything until the corresponding synchronization call.
> This allows the implementation to aggregate messages. Call order (of
> the MPI_Get()s in an epoch) is ignored.
Ouch! I did read the spec, about ten seconds before replying, and note
that I SAID it was non-blocking (from the spec) and was thinking about
using it with sync's. However, yes, this is a bit oxymoronic on a
reread (preserve call order vs non-blocking? Jeeze:-) and I consider
myself whomped upside the head:-)
Still, what IS to prevent him from alternating gets and synchronization
calls while retrieving? I don't think there is any alternative to
doing this in ANY non-blocking, potentially aggregating or parallel
communications scenario, although where he puts the barrier might vary.
Depending on whether he cares about the (ABCD)(ABCD)... order per se he
might try (however he does the "get"ting or receiving, or whatever):
get A (sync) get B (sync) get C (sync)
(inefficient but absolutely guarantees loop order).
If (ABCD)(BADC)(DCAB)... is ok (he doesn't care what order they arrive
in but he doesn't want the A communications to be aggregated so that two
A's get there before he gets the BCD from the same cycle of
computations) then he should be able to do:
get A get B get C get D (sync) get A get B get C...
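(An untested sketch of that second pattern -- one fence epoch per round, with
rank 0 pulling one element from every rank each round:

  #include <mpi.h>
  #include <stdlib.h>

  #define N 5                                  /* rounds / elements per rank */

  int main(int argc, char **argv)
  {
      int rank, P;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &P);

      double local[N];                         /* each rank's AAAAA/BBBBB/... */
      for (int i = 0; i < N; i++)
          local[i] = rank;

      double *gathered = (rank == 0) ? malloc((size_t)N * P * sizeof(double))
                                     : NULL;

      MPI_Win win;
      MPI_Win_create(local, N * sizeof(double), sizeof(double),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &win);

      for (int j = 0; j < N; j++) {
          MPI_Win_fence(0, win);
          if (rank == 0)
              for (int src = 0; src < P; src++)   /* "get A get B get C..."  */
                  MPI_Get(&gathered[j * P + src], 1, MPI_DOUBLE,
                          src, j, 1, MPI_DOUBLE, win);
          MPI_Win_fence(0, win);                  /* "(sync)": gets complete */
      }

      MPI_Win_free(&win);
      free(gathered);
      MPI_Finalize();
      return 0;
  }
)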
If I understand things correctly (where there is a very definite chance
that I don't!) these (on top of any library) will in the first case be
equivalent to a blocking TCP read from A, B, C... in order (but probably
not as efficient as TCP would be in this particular case because MPI is
optimized against the near diametrically opposite assumption) while the
second would be equivalent to using select to monitor a group of open
sockets for I/O, reading from them in the order that data becomes
available, but adding a toggle so you don't read from one twice before
reading from all of them. Although there is likely more than one way to
do it, and where in low-level programming one might well want to
implement handshaking of some sort to trigger the next cycle's send
(blocking the remote client's execution as necessary) to avoid
overrunning buffers or exhausting memory on the master/aggregator if for
any reason one host turns out to be "slow" relative to the others. In
MPI one hopes all of that is handled for you, and more.
> > Or of course there are always raw sockets... where you have complete
> > control. Depending on how critical it is that you preserve this strict
> > interleaving order.
> >
> > rgb
>
> No you don't! You're just letting the kernel buffer things instead of
> the MPI implementation. Plus, Michael's original concern was doing this
> in an elegant way, not explicitly controlling the ordering.
Of course one has maximum control with raw sockets (or more generically,
raw devices). Somewhere down inside MPI there ARE raw sockets (or
equivalent at some level in the ISO/OSI stack). The MPI library wraps
them up and hides all sorts of details while making all the device
features uniform across the now "invisible" devices and in the process
necessarily excluding one from ACCESSING all of those features a.k.a.
the details being hidden. I may have misunderstood the recent
discussion about the possible need for an MPI ABI but I thought that was
what it was all about -- providing a measure of low level control that
is BY DESIGN hidden by the API but is there, should one ever try to
code for it, in the actual kernel/device interface (e.g. regulated by
ioctls).
Note that at this low (device driver) level I would expect the kernel to
handle at least some asynchronous low-level buffering and the primary
interrupt processing for the physical device layer FOR the MPI
implementation or any other program that uses the device -- you cannot
safely escape this. This does not mean that you cannot control just
where you stop using the kernel and rely on your own intermediary layers
for handling the device above the level of raw read/write plus ioctls.
That is, the application layer or higher-order kernel networking layers
(depending on just where and how you access the device itself) may well
manage buffers of their own, reliability, retransmission, RDMA,
blocking/non-blocking of the interface, timeouts, and more. Low level
networking is not easy, which is WHY people wrote PVM and the MPI
network devices.
So ultimately, all I was observing is that it is pretty straightforward
(not necessarily easy, but straightforward) to write an application that
very definitely and without any question goes and gets a chunk of data
(say, contents of a struct) from an open socket on system A and puts it
in a memory location (with appropriate pointers and sizeof and so forth
for the struct), THEN gets a chunk of data from an open socket on system
B and puts it in the next memory location, THEN gets a chunk of data
from an open socket on system C and puts it ...
In fact, since TCP generally does block on a read until there is data on
the socket, it is relatively difficult to do it any other way in a
simple loop over sockets -- you have to use select as noted above to
avoid polling and non-blocking I/O, and in all cases one has to be
pretty careful not to drop things and to handle cases where a stream
runs slow or does other bad things. As I've learned the hard way.
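For what it's worth, the select-with-a-toggle pattern I have in mind
looks roughly like the following untested sketch (the descriptors in
sock[], the buffers and the chunk size are hypothetical, and real code
would also have to cope with short reads, EOF and errors):

#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

#define MAXPEERS 64

/* One cycle: read one chunk from each peer, in whatever order data
 * arrives, but never twice from the same peer within the cycle.
 * Untested sketch; assumes n <= MAXPEERS and that a single read()
 * returns the whole chunk, which real code must not assume.         */
void one_cycle(int *sock, int n, char buf[][4096], size_t chunk)
{
    int done[MAXPEERS] = {0};    /* toggle: peer already read this cycle? */
    int remaining = n;
    int i;

    while (remaining > 0) {
        fd_set rfds;
        int maxfd = -1;
        FD_ZERO(&rfds);
        for (i = 0; i < n; i++) {
            if (!done[i]) {
                FD_SET(sock[i], &rfds);
                if (sock[i] > maxfd) maxfd = sock[i];
            }
        }
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) <= 0)
            continue;            /* interrupted; try again */
        for (i = 0; i < n; i++) {
            if (!done[i] && FD_ISSET(sock[i], &rfds)) {
                ssize_t got = read(sock[i], buf[i], chunk);
                if (got > 0) { done[i] = 1; remaining--; }
            }
        }
    }
}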
As far as elegance is concerned:
a) That's a bit in the eye of the beholder. There are tradeoffs
between simplicity of code and ease of development work vs performance
and control but it is hard to say which is more "elegant". It's fairer
to make the value-neutral statement that you have to work much harder to
write a parallel application on top of raw sockets (no question at
all;-) but have all the control and optimization potential available to
userspace (at the ioctl level above the kernel and device driver itself)
if you do so.
To cite a metaphorical situation, is coding in assembler, whether one is
coding a complete application or an embedded optimized operation,
"inelegant"? Perhaps, but that's not the word I would have used. There
are times when assembler is very elegant, in the sense that it directly
encodes an algorithm with the greatest degree of optimization and
control where a compiler might well generate indifferent code or fail to
use all the features of the hardware. Once upon a time many many years
ago I hand-coded e.g. trig functions and arithmetic for the 8087 in
assembler because my compiler generated 8088 code that ran about ten
times more slowly. Elegant or inelegant?
Compare to just this situation -- if for some reason you require e.g.
absolute control over the order of utilization of your network links in
a parallel computation (perhaps to avoid collisions or device/line
contention, to do something exotic with transmission order and pattern
on a hypercubical network) you may well find that MPI or PVM simply do
not provide that degree of control, period. They try to "do the right
thing" for a generic class of problem and simple assumptions as to the
kind and number of interfaces and routes between nodes and load patterns
along those routes, BUT there is no >>guarantee<< that the right thing
they end up with (often chosen for robustness and optimization for the
most common classes of problems) will be right for YOUR problem and
network and no way to tweak it if it is not. In that case, using raw
network devices (whatever they might be) might well be the only way to
achieve the requisite level of control and yes, might be worth a factor
of 10 in speed.
I'll bet money that if you polled the list, you'd find that there exist
people who have gone in and hacked MPI at the source level to "break" it
(de-optimize it for the most common applications so they run worse) or
who have run over time several versions of MPI including "new and
improved" ones, who have found empirically that there are applications
for which the hacked/older "disimproved" versions perform better.
b) Anyway, this explains why I mentioned raw sockets at the end.
Note well the "Depending on how..." Maybe I read the original message
incorrectly, but I thought that the issue was that (for reasons unknown)
he wanted to guarantee collection in the order A then B then C... Why
he wanted to do this wasn't clear, nor was it clear whether (in any
given cycle) it would be ok to do A then C then B then... (and just not
overlap the next A). If the strict interleaving wasn't an issue, then I
would have thought just putting a barrier at the end of an ABC...Z cycle
would have forced all communications to complete before starting the
next cycle.
So IF this really IS a critical requirement -- he has to read from A,
complete the read blocking no fooling, move on and read from B, etc, no
data parallelism or asynchronicity permitted that might violate this
strict order (or if he's interleaving communications on four different
network devices along different routes to different sub-clusters of
nodes), then doing it within MPI might or might not be efficient. Raw
TCP sockets (or lower-level hardware-dependent I/O) would be a PITA to
code, but you can pretty much guarantee that the resulting code is as
efficient as possible, given the requirement, and it might be the ONLY
way to accomplish a complicated interleave of node I/O for some very
specific set of reasons. If you do the considerable work required to
make it so, of course, a copy of the complete works of Stevens in
hand...;-)
> Joachim had some good options for MPI.
I agree. I don't even disagree with what you say above -- I understand
what you mean. I just think that we need more data before concluding
that those options were enough. He described his design goal but not
his motivation. For some design goals there are probably lots of good
ways to do it in MPI.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rsweet at aoes.com Thu Mar 3 03:52:32 2005
From: rsweet at aoes.com (Ryan Sweet)
Date: Thu, 3 Mar 2005 09:52:32 +0100 (CET)
Subject: [Beowulf] So we will write our own book - next steps...
In-Reply-To: <20050302002104.EBYF8920.swebmail00.mail.ozemail.net@localhost>
References: <20050302002104.EBYF8920.swebmail00.mail.ozemail.net@localhost>
Message-ID:
The response to this thread has been great!
I am keeping track of all the responses, and will try to present some sort of
overview.
I have a system ready for hosting. What I'd like to do is to review a few
different wikis/collaboration systems/etc. to check on some issues such as
their security risks, ease of installation/maintenance, printing/offline
reading support, and so on.
If you have a preference please send me suggestions for consideration
off-list.
Either I'll set up a few of the best ones and then we can choose among them,
or if I don't have time I'll just set up the one I think is the best and if
people who are actually contributing don't like it we can discuss changing it
at that time. I think I will have something by Tuesday, though maybe earlier.
If you are thinking of writing something, please go have a look at the (older
but weathering well) FAQ on beowulf.org and then at Robert Brown's book first,
and, to avoid re-inventing the wheel, update, acknowledge, and borrow.
regards,
-Ryan
--
Ryan Sweet
Advanced Operations and Engineering Services
AOES Group BV http://www.aoes.com
Phone +31(0)71 5795521 Fax +31(0)71572 1277
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 3 13:40:07 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 3 Mar 2005 13:40:07 -0500 (EST)
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <42273D20.2000702@mcs.anl.gov>
References: <1109659454.6544.2.camel@localhost.localdomain>
<6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
<42273D20.2000702@mcs.anl.gov>
Message-ID:
On Thu, 3 Mar 2005, Rob Ross wrote:
OK, having re-reread everything, I conclude that you were completely
right after all. I misunderstood what his question was. I'm still not
certain that I understand, but if Bill has answered it, it definitely
isn't what I thought.
So double-whomp. I'll go sleep now.
rgb
>
>
> William Gropp wrote:
> > At 12:44 AM 3/1/2005, Michael Gauckler wrote:
> >
> >> Dear List,
> >>
> >> I would like to gather the data from several processes.
> >> Instead of the commonly used stride, I want to interleave
> >> the data:
> >>
> >> Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
> >> Rank 1: BBBBB ----^---^---^---^---^
> >> Rank 2: CCCCC -----^---^---^---^---^
> >> Rank 3: DDDDD ------^---^---^---^---^
> >>
> >> Since the stride of the receive type is indicated
> >> in multiples of its mpi_type, no interleaving is
> >> possible (the smallest striping factor leads to
> >> AAAAABBBBBCCCCCDDDDD).
> >>
> >> Is there a way to achieve this behaviour in an
> >> elegant way, as MPI_Gather promises it? Or do
> >> I need to do Send/Recv with self-aligned offsets?
> >
> >
> > You should be able to do this with MPI_Gather by creating a new datatype
> > on the receiving process whose extent is the size of a single item; that
> > will get you the correct offset for the first element. In order to
> > receive the subsequent elements into the desired location, you need to
> > use a vector type containing the number of elements. And for this to be
> > fast, you need an MPI implementation that will handle the "resized"
> > datatype efficiently (use MPI_Type_vector to create the full datatype
> > and MPI_Type_create_resized to change its effective extent). If you are
> > moving large amounts of data, separate send/recvs are probably a better
> > choice.
> >
> > Bill
>
> Nice!
>
> Rob
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From lindahl at pathscale.com Thu Mar 3 13:55:16 2005
From: lindahl at pathscale.com (Greg Lindahl)
Date: Thu, 3 Mar 2005 10:55:16 -0800
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To:
References:
<20050303020600.GA56437@piskorski.com>
Message-ID: <20050303185516.GB1453@greglaptop.internal.keyresearch.com>
On Thu, Mar 03, 2005 at 08:58:33AM -0600, Don Holmgren wrote:
> I was just quoted quantity 2 PCI-X HCA's (4x 2 port Mellanox) by a
> reseller for $655 each. Last Fall we purchased a large quantity of
> PCI-E HCA's for considerably less than that unit price.
>
> Supposedly the "memory free" PCI-E HCA's that use host memory, rather
> than on-board sram, should move prices towards $100 sometime this
> year - we'll see (landed on motherboard pricing at ~ $70, see
> http://www.mellanox.com/news/press/pr_030105.html)
You're mixing retail price with wholesale price. The $69 price is
apparently quantity 10,000, for just the chip, and the card that
is inexpensive will be PCIe 4X, which hurts performance.
I hear that today's street price for Mellanox-based cards is ~$600 in
cluster-sized quantities, which matches what you report.
> Like other high performance network gear, it's tough to get
> accurate pricing information without going out and getting quotes.
One nice thing about Myricom is that their prices have always been on
the web -- all you need to know in addition is your discount.
-- greg
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rross at mcs.anl.gov Thu Mar 3 14:17:50 2005
From: rross at mcs.anl.gov (Rob Ross)
Date: Thu, 03 Mar 2005 13:17:50 -0600
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To:
References: <1109659454.6544.2.camel@localhost.localdomain>
<6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
<42273D20.2000702@mcs.anl.gov>
Message-ID: <422762DE.1090209@mcs.anl.gov>
Sleep is good :).
Robert G. Brown wrote:
> On Thu, 3 Mar 2005, Rob Ross wrote:
>
> OK, having re-reread everything, I conclude that you were completely
> right after all. I misunderstood what his question was. I'm still not
> certain that I understand, but if Bill has answered it, it definitely
> isn't what I thought.
>
> So double-whomp. I'll go sleep now.
>
> rgb
>
>
>>
>>William Gropp wrote:
>>
>>>At 12:44 AM 3/1/2005, Michael Gauckler wrote:
>>>
>>>
>>>>Dear List,
>>>>
>>>>I would like to gather the data from several processes.
>>>>Instead of the commonly used stride, I want to interleave
>>>>the data:
>>>>
>>>>Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
>>>>Rank 1: BBBBB ----^---^---^---^---^
>>>>Rank 2: CCCCC -----^---^---^---^---^
>>>>Rank 3: DDDDD ------^---^---^---^---^
>>>>
>>>>Since the stride of the receive type is indicated
>>>>in multiples of its mpi_type, no interleaving is
>>>>possible (the smallest striping factor leads to
>>>>AAAAABBBBBCCCCCDDDDD).
>>>>
>>>>Is there a way to achieve this behaviour in an
>>>>elegant way, as MPI_Gather promises it? Or do
>>>>I need to do Send/Recv with self-aligned offsets?
>>>
>>>
>>>You should be able to do this with MPI_Gather by creating a new datatype
>>>on the receiving process whose extent is the size of a single item; that
>>>will get you the correct offset for the first element. In order to
>>>receive the subsequent elements into the desired location, you need to
>>>use a vector type containing the number of elements. And for this to be
>>>fast, you need an MPI implementation that will handle the "resized"
>>>datatype efficiently (use MPI_Type_vector to create the full datatype
>>>and MPI_Type_create_resized to change its effective extent). If you are
>>>moving large amounts of data, separate send/recvs are probably a better
>>>choice.
>>>
>>>Bill
>>
>>Nice!
>>
>>Rob
>>_______________________________________________
>>Beowulf mailing list, Beowulf at beowulf.org
>>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rsweet at aoes.com Thu Mar 3 03:29:56 2005
From: rsweet at aoes.com (Ryan Sweet)
Date: Thu, 3 Mar 2005 09:29:56 +0100 (CET)
Subject: [Beowulf] GRID APPLICATION
In-Reply-To: <012e01c51e4e$ea4b6860$0f120897@PMORND>
References: <012e01c51e4e$ea4b6860$0f120897@PMORND>
Message-ID:
On Tue, 1 Mar 2005, Rajiv wrote:
> Dear All,
> I have setup Globus 3.2 on two machines and I am able to submit job from
> one machine to another. I have a basic doubt about what application to run
> in GRID environments. Shouldn't the GRID application use resources of both
> the GRID machines simultaneously? Are there any applications like this? So
> far I am only running remote jobs from one machine to another - e.g. I can
> submit
> and run LINPACK/GROMACS job from one master of a cluster to a master of
> another cluster.
Dear Rajiv,
There seem to be a lot of people building clusters and grid systems lately
without applications to run on them ;-). That's nice to see, I guess, in as
much as it indicates the broad level of acceptance for the technologies. It
is very much the reverse of the "to scratch an itch" way in which things used
to be done. ;-)
Grid systems (the term means many things to many people - here I mean roughly
"a collection of resources used in collaboration spanning multiple
geographic locations and administrative domains") are used in a wide variety
of ways. In your scenario, if you are building a globus system in order to
learn about globus, and you can now run jobs on one host from another and
vice-versa, you've already got a lot of the hard work done.
If you would like to use multiple grid resources to simultaneously work on a
larger problem than any of them can tackle when working alone, then you need a
way to take your problem and partition it into chunks that can be submitted to
various resources around the grid; in your example, split a larger problem in
two, and submit half to each resource. It is not usually practical (though
there are exceptions) to run jobs which have a parallel communication
component (MPI or PVM) across grid resources (submitting multiple local MPI
jobs to multiple resources is OK though, provided that you have a way to
verify that the resources can accept your MPI jobs and run them - that's where
RSL and a broker, etc. come in). Some middleware to broker between the
requirements of your job and the available resources is usually used. There
are a lot of projects that do that, and any attempt I would make to list them
would surely leave out one or more deserving ones. Google is your friend.
For GROMACS there are lots of examples out there. Here is a very friendly
one from the UK NGS: http://www.ngs.ac.uk/sites/ox/software/gromacs.html
regards,
-Ryan
--
Ryan Sweet
Advanced Operations and Engineering Services
AOES Group BV http://www.aoes.com
Phone +31(0)71 5795521 Fax +31(0)71572 1277
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rross at mcs.anl.gov Thu Mar 3 15:50:24 2005
From: rross at mcs.anl.gov (Rob Ross)
Date: Thu, 03 Mar 2005 14:50:24 -0600
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <42273F9B.4040506@ccrl-nece.de>
References: <1109659454.6544.2.camel@localhost.localdomain> <6.2.1.2.2.20050303081639.098a3c10@pop.mcs.anl.gov>
<42273F9B.4040506@ccrl-nece.de>
Message-ID: <42277890.5070509@mcs.anl.gov>
Joachim Worringen wrote:
> William Gropp wrote:
>
>> You should be able to do this with MPI_Gather by creating a new
>> datatype on the receiving process whose extent is the size of a single
>> item; that will get you the correct offset for the first element. In
>> order to receive the subsequent elements into the desired location,
>> you need to use a vector type containing the number of elements. And
>> for this to be fast, you need an MPI implementation that will handle
>> the "resized" datatype efficiently (use MPI_Type_vector to create the
>> full datatype and MPI_Type_create_resized to change its effective
>> extent). If you are moving large amounts of data, separate send/recvs
>> are probably a better choice.
>
> Oh yes, I forgot, twiddling with LB and UB. I never liked this, esp. as
> an MPI implementor. Not especially 'elegant', but it should work. Good
> conformance test, BTW.
It needs to have a negative extent to really test things. The positive
extents are easy!
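(For anyone who hasn't met one, a rough and untested sketch of a type
with a negative extent, just to show what such a beast looks like:)

#include <mpi.h>

/* A resized type whose extent is negative: with count > 1, successive
 * elements are laid down at *decreasing* addresses.  Rough, untested
 * sketch for illustration only.                                      */
MPI_Datatype make_negative_extent_int(void)
{
    MPI_Datatype t;
    MPI_Type_create_resized(MPI_INT, 0, -(MPI_Aint)sizeof(int), &t);
    MPI_Type_commit(&t);
    /* e.g. receiving N of these starting at &a[N-1] fills a[] backwards */
    return t;
}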
Rob
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From list-beowulf at onerussian.com Thu Mar 3 11:33:59 2005
From: list-beowulf at onerussian.com (Yaroslav Halchenko)
Date: Thu, 3 Mar 2005 11:33:59 -0500
Subject: [Beowulf] DCC (debian cluster components)
Message-ID: <20050303163359.GJ4482@washoe.rutgers.edu>
Dear Debianized beowulfers or beowulfiezed Debian users,
Does anyone have experience with
http://www.irb.hr/en/cir/projects/dcc/
which recently was released?
Project Goals
We expect to integrate some existing technologies (like LDAP, System
Installation Suite, Torque, C3...) and develop a production-grade
toolset for easier cluster management, based on Debian GNU/Linux
distribution. This involves development of automation mechanisms that
provide a flexible platform for high-performance computation tasks, but
also provide the system administrator with a secure, easy to maintain,
reliable and well-supported cluster administration toolbox, based on
Debian GNU/Linux.
--
.-.
=------------------------------ /v\ ----------------------------=
Keep in touch // \\ (yoh@|www.)onerussian.com
Yaroslav Halchenko /( )\ ICQ#: 60653192
Linux User ^^-^^ [175555]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From djholm at fnal.gov Thu Mar 3 09:58:33 2005
From: djholm at fnal.gov (Don Holmgren)
Date: Thu, 03 Mar 2005 08:58:33 -0600
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To: <20050303020600.GA56437@piskorski.com>
References:
<20050303020600.GA56437@piskorski.com>
Message-ID:
I was just quoted quantity 2 PCI-X HCA's (4x 2 port Mellanox) by a
reseller for $655 each. Last Fall we purchased a large quantity of
PCI-E HCA's for considerably less than that unit price.
Supposedly the "memory free" PCI-E HCA's that use host memory, rather
than on-board sram, should move prices towards $100 sometime this
year - we'll see (landed on motherboard pricing at ~ $70, see
http://www.mellanox.com/news/press/pr_030105.html)
Like other high performance network gear, it's tough to get
accurate pricing information without going out and getting quotes.
Don Holmgren
On Wed, 2 Mar 2005, Andrew Piskorski wrote:
> On Wed, Mar 02, 2005 at 06:09:09PM -0500, Mark Hahn wrote:
> > > Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
> >
> > given the choice between a $150 pcie IB nic and having it onboard,
> > I'd choose the separate card. I know the IB salesdroids always
>
> Except, a single PCI-X Infiniband card currently costs $1000 or so,
> right? (That's for a 4x 2 port card, but Froogle does not seem to
> know of any cheaper cards.)
>
> http://h30094.www3.hp.com/product.asp?sku=2603660&jumpid=ex_r2910_frooglesmb/accessories
> http://www.costcentral.com/proddetail/HP_NC570C/376158B21/F35425/froogle/
>
> --
> Andrew Piskorski
> http://www.piskorski.com/
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From hahn at physics.mcmaster.ca Thu Mar 3 20:24:06 2005
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Thu, 3 Mar 2005 20:24:06 -0500 (EST)
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To:
Message-ID:
> Supposedly the "memory free" PCI-E HCA's that use host memory, rather
> than on-board sram, should move prices towards $100 sometime this
> year - we'll see (landed on motherboard pricing at ~ $70, see
> http://www.mellanox.com/news/press/pr_030105.html)
the IB world (still consisting only of Mellanox chips, right?)
has done a good job pushing down adapter prices.
can anyone comment on trends in switch pricing?
thanks, mark hahn.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at streamline-computing.com Fri Mar 4 04:33:18 2005
From: john.hearns at streamline-computing.com (John Hearns)
Date: Fri, 04 Mar 2005 09:33:18 +0000
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To: <20050303163359.GJ4482@washoe.rutgers.edu>
References: <20050303163359.GJ4482@washoe.rutgers.edu>
Message-ID: <1109928798.5537.22.camel@Vigor45>
On Thu, 2005-03-03 at 11:33 -0500, Yaroslav Halchenko wrote:
> Dear Debianized beowulfers or beowulfiezed Debian users,
>
> Does anyone have experience with
> http://www.irb.hr/en/cir/projects/dcc/
> which recently was released?
>
> Project Goals
>
> We expect to integrate some existing technologies (like LDAP, System
> Installation Suite,
Seems strange that they haven't chosen FAI (Fully Automated Installer).
As an aside, there was a poster about FAI up outside the cluster
developer's room at FOSDEM last weekend.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Fri Mar 4 08:03:00 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 4 Mar 2005 08:03:00 -0500 (EST)
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To: <1109928798.5537.22.camel@Vigor45>
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID:
On Fri, 4 Mar 2005, John Hearns wrote:
> On Thu, 2005-03-03 at 11:33 -0500, Yaroslav Halchenko wrote:
> > Dear Debianized beowulfers or beowulfiezed Debian users,
> >
> > Does anyone have experience with
> > http://www.irb.hr/en/cir/projects/dcc/
> > which recently was released?
> >
> > Project Goals
> >
> > We expect to integrate some existing technologies (like LDAP, System
> > Installation Suite,
>
> Seems strange that they haven't chosen FAI (Fully Automated Installer).
> As an aside, there was a poster about FAI up outside the cluster
> developer's room at FOSDEM last weekend.
Is FAI being loved by somebody at this point? There was a time a few
years ago where it seemed to be lying fallow (although as always I could
be mistaken about that). Toolsets like that usually need a fairly
active and energetic human to care for them, if not several...
rgb
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From roger at erc.msstate.edu Fri Mar 4 09:10:28 2005
From: roger at erc.msstate.edu (Roger L. Smith)
Date: Fri, 4 Mar 2005 08:10:28 -0600 (CST)
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To:
References:
Message-ID:
On Thu, 3 Mar 2005, Mark Hahn wrote:
> the IB world (still consisting only of Mellanox chips, right?)
> has done a good job pushing down adapter prices.
>
> can anyone comment on trends in switch pricing?
I know at least one vendor has a 24 port model using the newer IB chipset
for around $8,000.
_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_
| Roger L. Smith Phone: 662-325-3625 |
| Sr. Systems Administrator FAX: 662-325-7692 |
| roger at ERC.MsState.Edu http://WWW.ERC.MsState.Edu/~roger |
| Mississippi State University |
|____________________________________ERC__________________________________|
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gmpc at sanger.ac.uk Fri Mar 4 09:40:01 2005
From: gmpc at sanger.ac.uk (Guy Coates)
Date: Fri, 4 Mar 2005 14:40:01 +0000 (GMT)
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To:
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID:
>
> Is FAI being loved by somebody at this point? There was a time a few
> years ago where it seemed to be lying fallow (although as always I could
> be mistaken about that). Toolsets like that usually need a fairly
> active and energetic human to care for them, if not several...
It is still alive. We're just in the process of rolling out a new
cluster with it at this very moment. Works fine.
Guy
--
Dr. Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Fri Mar 4 11:04:37 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Fri, 04 Mar 2005 11:04:37 -0500
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To: <4228821F.90607@charter.net>
References:
<4228821F.90607@charter.net>
Message-ID: <42288715.10403@scalableinformatics.com>
8 ports under 8k or 24 ports under 8k?
Jeffrey B. Layton wrote:
> However, to match what Roger said, one IB vendor gave me
> a list price for 8-ports of IB for under $8,000.
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From laytonjb at charter.net Fri Mar 4 11:16:03 2005
From: laytonjb at charter.net (Jeffrey B. Layton)
Date: Fri, 04 Mar 2005 11:16:03 -0500
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To: <42288715.10403@scalableinformatics.com>
References:
<4228821F.90607@charter.net>
<42288715.10403@scalableinformatics.com>
Message-ID: <422889C3.5080902@charter.net>
8 ports under 8k, but it was a 24 port switch :)
This includes all of the HCA's, switches (only one),
cables, and software.
Jeff
> 8 ports under 8k or 24 ports under 8k?
>
> Jeffrey B. Layton wrote:
>
>> However, to match what Roger said, one IB vendor gave me
>> a list price for 8-ports of IB for under $8,000.
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From roger at erc.msstate.edu Fri Mar 4 11:34:32 2005
From: roger at erc.msstate.edu (Roger L. Smith)
Date: Fri, 4 Mar 2005 10:34:32 -0600 (CST)
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To: <422889C3.5080902@charter.net>
References:
<4228821F.90607@charter.net>
<42288715.10403@scalableinformatics.com> <422889C3.5080902@charter.net>
Message-ID:
The price I stated was for a 24 port switch for around $8,000 list. As a
matter of fact, I just confirmed this with the vendor.
This does not include cables or HCAs.
On Fri, 4 Mar 2005, Jeffrey B. Layton wrote:
> 8 ports under 8k, but it was a 24 port switch :)
> This includes all of the HCA's, switches (only one),
> cables, and software.
>
> Jeff
>
> > 8 ports under 8k or 24 ports under 8k?
> >
> > Jeffrey B. Layton wrote:
> >
> >> However, to match what Roger said, one IB vendor gave me
> >> a list price for 8-ports of IB for under $8,000.
> >
>
_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_
| Roger L. Smith Phone: 662-325-3625 |
| Sr. Systems Administrator FAX: 662-325-7692 |
| roger at ERC.MsState.Edu http://WWW.ERC.MsState.Edu/~roger |
| Mississippi State University |
|____________________________________ERC__________________________________|
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From laytonjb at charter.net Fri Mar 4 10:43:27 2005
From: laytonjb at charter.net (Jeffrey B. Layton)
Date: Fri, 04 Mar 2005 10:43:27 -0500
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To:
References:
Message-ID: <4228821F.90607@charter.net>
Roger L. Smith wrote:
>On Thu, 3 Mar 2005, Mark Hahn wrote:
>
>
>
>>the IB world (still consisting only of Mellanox chips, right?)
>>has done a good job pushing down adapter prices.
>>
>>can anyone comment on trends in switch pricing?
>>
>>
>
>I know at least one vendor has a 24 port model using the newer IB chipset
>for around $8,000.
>
>
>
I just finished an interconnect survey article for Doug and
ClusterWorld Magazine. As part of the article I have a nice
table with list prices for 8 nodes and 128 nodes for the
various interconnects. It should be out in the May issue
so be sure to look for it.
However, to match what Roger said, one IB vendor gave me
a list price for 8-ports of IB for under $8,000.
Jeff
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Fri Mar 4 11:49:17 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Fri, 4 Mar 2005 17:49:17 +0100
Subject: [Beowulf] Re: OS X in a Classified Environment... (fwd from
kstaats@terrasoftsolutions.com)
Message-ID: <20050304164917.GR13336@leitl.org>
----- Forwarded message from Kai Staats -----
From: Kai Staats
Date: Fri, 4 Mar 2005 09:22:42 -0700
To: scitech at lists.apple.com
Subject: Re: OS X in a Classified Environment...
Organization: Terra Soft Solutions, Inc.
User-Agent: KMail/1.7
Reply-To: kstaats at terrasoftsolutions.com
Bryan,
[snip]
> Army contractor for aerodynamics work. I would be interested to find
> out what happened to the Navy sonar cluster compute project that used
> G4 servers running Linux...
The original 272 G4 Xserves implemented in 2003 continue to be in use on-board
the subs (from what I understand). In addition, the TI04 project (this past
summer) invoked the use of G5 Xserves running our 64-bit Linux OS, Y-HPC. If
PowerPC continues to be used in the sonar imaging environment, Linux will
continue to be the preferred OS due to its flexibility and ease of code
migration to/from non-PowerPC systems that remain a part of the on-board
imaging systems.
More info here:
http://www.terrasoftsolutions.com/realworld/showcase/dod/
... with several other DoE/DoD customers:
http://www.terrasoftsolutions.com/products/y-hpc/customers.shtml
kai
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Scitech mailing list (Scitech at lists.apple.com)
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/scitech/eugen%40leitl.org
This email sent to eugen at leitl.org
----- End forwarded message -----
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jrajiv at hclinsys.com Fri Mar 4 01:51:09 2005
From: jrajiv at hclinsys.com (Rajiv)
Date: Fri, 4 Mar 2005 12:21:09 +0530
Subject: [Beowulf] HA OSCAR for loadbalancing and failover
Message-ID: <006d01c52086$8c50a8d0$0f120897@PMORND>
Dear Sir,
I am carrying out load balancing using OSCAR-3.0. We are also carrying out
failover using the HA-OSCAR 1.0 beta release (High Availability OSCAR). We are
required to achieve load balancing
and failover for the following services:
1. HTTP.
2. FTP.
3. TELNET.
4. DHCP.
5. SQUID.
Our setup is as follows:
1 primary server, 1 client node (using OSCAR-3.0)
1 standby server (using HA-OSCAR)
We have succeeded in building the cluster but are
having problems regarding load balancing. We are trying to achieve
load balancing using the PBS server (Portable Batch System) which comes
built in with OSCAR-3.0. We are queuing the services as jobs and trying to
distribute these jobs between the server and client node. But the problem
we are facing is that we are not able to submit the job to the PBS server.
Sir, firstly, we would like you to confirm whether we are on the right
track for achieving load balancing. We would also like to know how you all
have achieved load balancing?
Regards,
Rajiv
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From lists at subnetz.org Fri Mar 4 08:43:27 2005
From: lists at subnetz.org (Tilman Koschnick)
Date: Fri, 04 Mar 2005 14:43:27 +0100
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To:
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID: <1109943807.2081.51.camel@mother.subnetz.org>
On Fri, 2005-03-04 at 08:03 -0500, Robert G. Brown wrote:
> Is FAI being loved by somebody at this point? There was a time a few
> years ago where it seemed to be lying fallow (although as always I could
> be mistaken about that). Toolsets like that usually need a fairly
> active and energetic human to care for them, if not several...
I think it is. The latest (fairly long) changelog entry - version 2.6.6
- dates from 21 Jan 2005. I went to a talk by the FAI maintainer a
couple of months ago, and he didn't give the impression he was going to
abandon it any time soon. There was talk about porting FAI to
Redhat/rpm, but I don't know what the state of this is.
Cheers, Til
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From list-beowulf at onerussian.com Fri Mar 4 09:22:08 2005
From: list-beowulf at onerussian.com (Yaroslav Halchenko)
Date: Fri, 4 Mar 2005 09:22:08 -0500
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To:
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID: <20050304142208.GC32176@washoe.rutgers.edu>
On Fri, Mar 04, 2005 at 08:03:00AM -0500, Robert G. Brown wrote:
> Is FAI being loved by somebody at this point? There was a time a few
> years ago where it seemed to be lying fallow (although as always I could
> be mistaken about that). Toolsets like that usually need a fairly
> active and energetic human to care for them, if not several...
I like FAI and although it is just a set of scripts, it seems to be
stable. I used it for the first time 1.5 years ago to install the first
10 nodes on the cluster. I had to tweak it to make it work, but when
we got 15 more nodes 5 months ago, my old FAI configuration
required just a few adjustments to do its job.
DCC seems to use the idea of an "image-based installation model" as opposed to
FAI, which does flexible cloned installation. For uniform clusters an
image-based installation is probably better than FAI, which admits
different classes of configuration and is thus more flexible.
--
.-.
=------------------------------ /v\ ----------------------------=
Keep in touch // \\ (yoh@|www.)onerussian.com
Yaroslav Halchenko /( )\ ICQ#: 60653192
Linux User ^^-^^ [175555]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From tallpaul at speakeasy.org Fri Mar 4 15:43:12 2005
From: tallpaul at speakeasy.org (Paul English)
Date: Fri, 4 Mar 2005 12:43:12 -0800 (PST)
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To:
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID:
On Fri, 4 Mar 2005, Robert G. Brown wrote:
> On Fri, 4 Mar 2005, John Hearns wrote:
>
> > On Thu, 2005-03-03 at 11:33 -0500, Yaroslav Halchenko wrote:
> > > Dear Debianized beowulfers or beowulfiezed Debian users,
> > >
> > > Does anyone have experience with
> > > http://www.irb.hr/en/cir/projects/dcc/
> > > which recently was released?
> > >
> > > Project Goals
> > >
> > > We expect to integrate some existing technologies (like LDAP, System
> > > Installation Suite,
> >
> > Seems strange that they haven't chosen FAI (Fully Automated Installer).
> > As an aside, there was a poster about FAI up outside the cluster
> > developer's room at FOSDEM last weekend.
>
> Is FAI being loved by somebody at this point? There was a time a few
> years ago where it seemed to be lying fallow (although as always I could
> be mistaken about that). Toolsets like that usually need a fairly
> active and energetic human to care for them, if not several...
The list is alive, and has posts, etc. I did not have a great deal of luck
getting help with my questions and the process is (if anything) more raw
than Kickstart. I gave it a good try for several months because many of
our machines are debian, but in the end I gave up.
For clustering purposes, ROCKS has been much more useful - quick, useful
responses on the mailing list, and a lot more of the lower-level crud is
hidden with simpler utilities.
For general network installs (workstations, general purpose servers),
Kickstart was far easier to use and find answers for than FAI, although it
could use some of the abstraction that ROCKS has.
In the "modern" era of PXE, on "most networks" adding new machines
with specific configurations could be done with a single command or GUI.
We're not there yet. :-)
Paul
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From egan at sense.net Fri Mar 4 11:41:30 2005
From: egan at sense.net (Egan Ford)
Date: Fri, 4 Mar 2005 09:41:30 -0700
Subject: [Beowulf] Windows Server 2003 Compute Cluster Edition
In-Reply-To: <422889C3.5080902@charter.net>
Message-ID: <002201c520d9$04f6c380$e2054109@oberon>
Unless it is given away, offers better price/performance, or has a killer app,
I estimate that the number of Windows HPC clusters will be very small. I'd like to say zero, but
I have customers today doing HPC on Windows. The apps are not available for
other platforms.
http://news.com.com/Windows+for+supercomputers+likely+out+by+fall/2100-1012_3-5598603.html?part=rss&tag=5598782&subj=news
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From maurice at harddata.com Fri Mar 4 02:56:46 2005
From: maurice at harddata.com (Maurice Hilarius)
Date: Fri, 04 Mar 2005 00:56:46 -0700
Subject: [Beowulf] Re: Beowulf Digest, Vol 13, Issue 4
In-Reply-To: <200503031629.j23GThuk022748@bluewest.scyld.com>
References: <200503031629.j23GThuk022748@bluewest.scyld.com>
Message-ID: <422814BE.1070203@harddata.com>
Andrew Piskorski wrote:
>From: Andrew Piskorski
>Subject: Re: [Beowulf] 2.6.11 is out; with InfiBand support
>
>
>Except, a single PCI-X Infiniband card currently costs $1000 or so,
>right? (That's for a 4x 2 port card, but Froogle does not seem to
>know of any cheaper cards.)
>
>
New "name brand" ( sorry, it's under NDA) "Memory Free" cards will be
selling for under $400 for PCI Express 4X, and under $600 for PCI
Express 8X.
Availability Q2 for production quantities.
Still a lot more expensive than Myrinet, and Myri have their own
surprises to reveal in that time frame as well.
It's a great time for cluster computing:
Opteron Rev E
nForce4
dual core Opterons
S-ATA2
SAS
Economical 10Gb interconnects.
Wow.
With our best regards,
Maurice W. Hilarius Telephone: 01-780-456-9771
Hard Data Ltd. FAX: 01-780-456-9772
11060 - 166 Avenue email:maurice at harddata.com
Edmonton, AB, Canada http://www.harddata.com/
T5X 1Y3
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From agshew at gmail.com Fri Mar 4 10:51:06 2005
From: agshew at gmail.com (Andrew Shewmaker)
Date: Fri, 4 Mar 2005 08:51:06 -0700
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To: <1109928798.5537.22.camel@Vigor45>
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID:
On Fri, 04 Mar 2005 09:33:18 +0000, John Hearns
wrote:
> On Thu, 2005-03-03 at 11:33 -0500, Yaroslav Halchenko wrote:
> > Dear Debianized beowulfers or beowulfiezed Debian users,
> >
> > Does anyone have experience with
> > http://www.irb.hr/en/cir/projects/dcc/
> > which recently was released?
> >
> > Project Goals
> >
> > We expect to integrate some existing technologies (like LDAP, System
> > Installation Suite,
>
> Seems strange that they haven't chosen FAI (Fully Automated Installer).
> As an aside, there was a poster about FAI up outside the cluster
> developer's room at FOSDEM last weekend.
If you think it is strange that they appear to have chosen System Installation
Suite over FAI because you are thinking that SIS is focused on RPM distros
(I was under that impression at one time), then you should know that SIS is
primarily developed on Debian.
--
Andrew Shewmaker
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From maillists at gauckler.ch Fri Mar 4 17:30:26 2005
From: maillists at gauckler.ch (Michael Gauckler)
Date: Fri, 04 Mar 2005 23:30:26 +0100
Subject: [Beowulf] MPI programming question: Interleaved MPI_Gatherv?
In-Reply-To: <1109659454.6544.2.camel@localhost.localdomain>
References: <1109659454.6544.2.camel@localhost.localdomain>
Message-ID: <1109975426.5116.16.camel@localhost.localdomain>
Dear List,
thank you for all your replies concerning my question about interleaved
gathers. ("Interleaved" was meant in terms of memory layout, not
time of arrival of the message.)
Yes, there is a solution to this problem by changing the lower and upper
bounds of the datatype with the help of MPI_Type_create_resized.
Through the lam-mpi mailing list I got a reply from Josh which I'd like to
share with you because it even includes the source of a demo application
(see below).
Thank you very much! Yours,
Michael
___
From: Josh Hursey
Date: Tue, 1 Mar 2005 09:50:43 -0500 (15:50 CET)
Yes, this can be achieved in an elegant way with MPI_Gather, but you
need to adjust the receive datatype. You will need to create a new
MPI_Datatype that will stride as you need it to. The trick is to shift
the lower and upper bounds on this new strided data type so it will
interleave values. Something like:
/* Create a datatype to receive into. */
MPI_Type_vector( NUM_LOCAL_ELE,  /* # of blocks */
                 1,              /* # of datatypes in a block (one for this array) */
                 gsize,          /* Stride between successive blocks */
                 MPI_CHAR,       /* Type of each block */
                 &old_type);
MPI_Type_commit( &old_type);
/* Resize the type to allow interleaving,
 * so make it only one MPI_CHAR wide
 */
MPI_Type_create_resized(old_type,
                        0,       /* Lower bound */
                        1,       /* New extent: one MPI_CHAR */
                        &new_type);
MPI_Type_commit( &new_type);
Then use the new_type as the receive type argument to the MPI_Gather
function. I attached a sample code that does exactly this, and produces
the following output:
$ mpirun -np 4 gather_interleave
Rank 0  A  A  A  A  A  A  A  A  A  A  A  A
Rank 1  B  B  B  B  B  B  B  B  B  B  B  B
Rank 2  C  C  C  C  C  C  C  C  C  C  C  C
Rank 3  D  D  D  D  D  D  D  D  D  D  D  D
Final:
        A  B  C  D  A  B  C  D  A  B  C  D
        A  B  C  D  A  B  C  D  A  B  C  D
        A  B  C  D  A  B  C  D  A  B  C  D
        A  B  C  D  A  B  C  D  A  B  C  D
Hope this helps.
Josh
--------------------
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NUM_LOCAL_ELE 12

int main(int argc, char *argv[]){
    int rank, gsize, i, j;
    char local_array[NUM_LOCAL_ELE];
    char *collected_array = NULL;
    MPI_Datatype new_type, old_type;

    /* Initialize */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &gsize);

    /* Create a datatype to receive into. */
    MPI_Type_vector( NUM_LOCAL_ELE,  /* # of blocks */
                     1,              /* # of datatypes in a block (one for this array) */
                     gsize,          /* Stride between successive blocks */
                     MPI_CHAR,       /* Type of each block */
                     &old_type);
    MPI_Type_commit( &old_type);

    /* Resize the type to allow interleaving,
     * so make it only one MPI_CHAR wide
     */
    MPI_Type_create_resized(old_type,
                            0,       /* Lower bound */
                            1,       /* New extent: one MPI_CHAR */
                            &new_type);
    MPI_Type_commit( &new_type);

    /* Initialize local array with characters:
     * Rank 0 = A A A...
     * Rank 1 = B B B...
     * Rank 2 = C C C...
     * ...
     */
    for(i = 0; i < NUM_LOCAL_ELE; ++i ) {
        local_array[i] = 'A' + rank;
    }

    /* Print out local array */
    sleep(rank * 1);
    printf("Rank %d", rank);
    for(i = 0; i < NUM_LOCAL_ELE; ++i) {
        printf("\t%c", local_array[i]);
    }
    printf("\n");

    if(rank == 0)
        collected_array = (char *)malloc(gsize * NUM_LOCAL_ELE * sizeof(char));

    MPI_Gather( local_array, NUM_LOCAL_ELE, MPI_CHAR,
                collected_array, 1, new_type, 0, MPI_COMM_WORLD);

    /* Print out Gathered array */
    if(rank == 0) {
        printf("Final:\n");
        for(i = 0; i < gsize; ++i) {
            for(j = 0; j < NUM_LOCAL_ELE; ++j) {
                printf("\t%c", collected_array[i*NUM_LOCAL_ELE+j]);
            }
            printf("\n");
        }
    }

    if (rank == 0)
        free(collected_array);

    MPI_Finalize();
    return 0;
}
On Tuesday, 01.03.2005 at 07:44 +0100, Michael Gauckler wrote:
> Dear List,
>
> I would like to gather the data from several processes.
> Instead of the commonly used stride, I want to interleave
> the data:
>
> Rank 0: AAAAA -> ABCDABCDABCDABCDABCD
> Rank 1: BBBBB ----^---^---^---^---^
> Rank 2: CCCCC -----^---^---^---^---^
> Rank 3: DDDDD ------^---^---^---^---^
>
> Since the stride of the receive type is indicated
> in multiples of its mpi_type, no interleaving is
> possible (the smallest striping factor leads to
> AAAAABBBBBCCCCCDDDDD).
>
> Is there a way to achieve this behaviour in an
> elegant way, as MPI_Gather promises it? Or do
> I need to do Send/Recv with self-aligned offsets?
>
> Thank you for your help!
>
> Michael
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From canon at nersc.gov Thu Mar 3 13:28:02 2005
From: canon at nersc.gov (Shane Canon)
Date: Thu, 03 Mar 2005 10:28:02 -0800
Subject: [Beowulf] RAID storage: Vendor vs. parts
In-Reply-To: <421C2DA4.8090608@psc.edu>
References:
<421C2DA4.8090608@psc.edu>
Message-ID: <42275732.2020306@nersc.gov>
This was my experience as well. Most of the tools worked out of the
box, but the partitioning was a real hang up. The easiest way around
this is to have a separate boot drive that the installer can partition
in its normal manner and then to have the >2TB device be completely
separate and configured after first boot. You then use the whole drive
(no partitions). This worked for me.
--Shane
Paul Nowoczynski wrote:
| Alvin Oga wrote:
|
|> hi ya steve
|>
|> On Tue, 22 Feb 2005, Steve Cousins wrote:
|>
|>
|>
|>> That's what I'm shooting for. Anybody have good luck with volumes
|>> greater
|>> than 2 TB with Linux? I think LSI SCSI cards are needed (?) and the 2.6
|>> Kernel is needed with CONFIG_LBD=y. Any hints or notes about doing this
|>> would be greatly appreciated. Google has not been much of a friend on
|>> this unfortunatlely. I'm guessing I'd run into NFS limits too.
|>>
|>
|>
|> for files/volumes over 2TB ... it's a question of libs, apps and
|> kernel everything has to work ... which is not always the case
|>
|>
|>
| We've got this working at PSC without too much pain.. even with scsi
| block devices >2TB. The LBD is needed but it
| doesn't solve all the problems with large disks, especially if you have
| a single volume which is larger than
| 2TB. The issue we ran into was that many disk related apps like mdadm
| and [s]fdisk don't support
| the BLKGETSIZE64 ioctl. So even though your kernel is using 64 bits,
| some needed apps are not. There are also issues with disklabels for
| devices >2TB. The normal dos-style disklabel used by linux
| doesn't support them so you'll need a kernel patch for the "plaintext"
| partition table made by Andries Brouwer.
| If you're interested in running this on 2.6 I can give you the patch.
| As far as cards go I think the adaptec u320 cards
| are better. I've seen less scsi timeout weirdness with them (this could
| be related to our disks). Performance wise
| the lsi and adaptec are about the same.. we see ~400MB/sec when using
| both channels - even with a sub pci-x bus. For a couple hundred bucks a
| card this is really good news.
| --paul
|
|> i don't play much with 2.6 kernels other than on suse-9.x boxes
|>
|>
|>
|>> Also, am I being overly cautious about having a spare RAID controller on
|>> hand? How frequent do RAID controllers go bad compared to disks, power
|>> supplies and fan modules? I'd guess that it would be very infrequent.
|>>
|>
|>
|> it's always better to have spare parts ... ( part of my requirement ) if
|> they expect the systems to be available 24x7 ...
|> - more importantly, how long can they wait, when silly inexpensive
|> things die, before it gets replaced
|>
|> - dead fans is $2.oo - $15 each to keep the disks cool
|>
|> - power supply is $50 range ... but if one bought n+1 powersupply
|> than its supposed to not be an issue anymore, but you will need to
|> have its replacement handy
|>
|> - raid controllers should NOT die, nor cpu, mem, mb, nic, etc
|> and it's not cheap to have these items floating around as spare
|> parts
|>
|> - ethernet cables will go funky if random people have access
|> to the patch panels ... ( keep the fingers away )
|>
|> - ups will go bonkers too
|>
|> - what failure mode can one protect against and what will happen
|> if "it" dies
|> - best protection against downtime for users is to have an
|> warm-swap server which is updated a hourly or daily ... ( my
|> preference - 2nd identical or bigger-disk capacity system )
|>
|>
|>
|>> Looking back at my own experience I think I've had to return one out
|>> of 15
|>> in the last eight years, and that was bad as soon as I bought it.
|>>
|>
|>
|> seems too high of a return rate ?? 1 out of 15 ??
|>
|>
|>
|>> If this is too off-topic let me know and I'll move it elsewhere.
|>>
|>
|>
|> ditto here
|> 24x7x365 uptime compute environment is fun/frustrating stuff on tight
|> budgets
|>
|> c ya
|> alvin
|>
|> _______________________________________________
|> Beowulf mailing list, Beowulf at beowulf.org
|> To change your subscription (digest mode or unsubscribe) visit
|> http://www.beowulf.org/mailman/listinfo/beowulf
|>
|>
|
| _______________________________________________
| Beowulf mailing list, Beowulf at beowulf.org
| To change your subscription (digest mode or unsubscribe) visit
| http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From taj at www.linux.org.uk Thu Mar 3 17:39:28 2005
From: taj at www.linux.org.uk (Trent Jarvi)
Date: Thu, 3 Mar 2005 22:39:28 +0000 (GMT)
Subject: [Beowulf] http://www.beowulf.org/overview/history.html
Message-ID:
Just a heads up.
This page appears to be corrupted.
While not all Beowulf clusters are supercomputers, one can build a Beowulf
that is powerful enough to attract the interest of supercomputer users.
Beyond the seasoned parallel programmer, Beowulf clusters have been built
and used by programmers with little or no parallel programming experience.
Beowulf clusters provide universities, often with limited resources, an
excellent platform to teach parallel programming cNvq0ZhTgBrP
kOLZWGuE0+ZiqlFOd2ml5US6LXQ/8jfnOSP4wydRdXTBOTOpewexZw1KyyFaZYgXTx5zQTNf
5QFWN4fE0H3CCkPYVhNTdPWIDurIhwMLdwxbCTM6fcG3+JA+1TpQX+s5ZlYw5+bvDqkre+1Y
[...]
--
Trent Jarvi
taj at www.linux.org.uk
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From djholm at fnal.gov Fri Mar 4 13:43:59 2005
From: djholm at fnal.gov (Don Holmgren)
Date: Fri, 04 Mar 2005 12:43:59 -0600
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To:
References:
<4228821F.90607@charter.net> <42288715.10403@scalableinformatics.com>
<422889C3.5080902@charter.net>
Message-ID:
I've made two purchases in the last 12 months of 24-port switches.
Two switches last April came in at ~ $4000 each.
16 switches last Sept came in at ~ $3300 each.
These were two different brands of switches, both based on the Mellanox
Infiniscale III (24 port crossbar) silicon.
Clearly YMMV on pricing.
Don Holmgren
On Fri, 4 Mar 2005, Roger L. Smith wrote:
>
>
> The price I stated was for a 24 port switch for around $8,000 list. As a
> matter of fact, I just confirmed this with the vendor.
>
> This does not include cables or HCAs.
>
> On Fri, 4 Mar 2005, Jeffrey B. Layton wrote:
>
> > 8 ports under 8k, but it was a 24 port switch :)
> > This includes all of the HCA's, switches (only one),
> > cables, and software.
> >
> > Jeff
> >
> > > 8 ports under 8k or 24 ports under 8k?
> > >
> > > Jeffrey B. Layton wrote:
> > >
> > >> However, to match what Roger said, one IB vendor gave me
> > >> a list price for 8-ports of IB for under $8,000.
> > >
> >
>
>
> _\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_
> | Roger L. Smith Phone: 662-325-3625 |
> | Sr. Systems Administrator FAX: 662-325-7692 |
> | roger at ERC.MsState.Edu http://WWW.ERC.MsState.Edu/~roger |
> | Mississippi State University |
> |____________________________________ERC__________________________________|
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From djholm at fnal.gov Thu Mar 3 17:31:56 2005
From: djholm at fnal.gov (Don Holmgren)
Date: Thu, 03 Mar 2005 16:31:56 -0600
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To: <20050303185516.GB1453@greglaptop.internal.keyresearch.com>
References:
<20050303020600.GA56437@piskorski.com>
<20050303185516.GB1453@greglaptop.internal.keyresearch.com>
Message-ID:
On Thu, 3 Mar 2005, Greg Lindahl wrote:
> On Thu, Mar 03, 2005 at 08:58:33AM -0600, Don Holmgren wrote:
>
> > I was just quoted quantity 2 PCI-X HCA's (4x 2 port Mellanox) by a
> > reseller for $655 each. Last Fall we purchased a large quantity of
> > PCI-E HCA's for considerably less than that unit price.
> >
> > Supposedly the "memory free" PCI-E HCA's that use host memory, rather
> > than on-board sram, should move prices towards $100 sometime this
> > year - we'll see (landed on motherboard pricing at ~ $70, see
> > http://www.mellanox.com/news/press/pr_030105.html)
>
> You're mixing retail price with wholesale price. The $69 price is
> apparently quantity 10,000, for just the chip, and the card that
> is inexpensive will be PCIe 4X, which hurts performance.
Yes, it will be interesting to see what the motherboard vendors charge
for an IB option. $69 (assuming they hit the 10K volume) + the price of
the I/O connector + engineering cost + margin. I'm hoping that it will
be < $150, but I may be too optimistic.
Interesting comment about PCIe 4X hurting performance, thanks! The
current PCI-E cards have two ports and use an 8X slot, but I'd guess
that most cluster applications use only a single port. What's the
performance penalty for using a single 4X port HCA in a 4X PCI-E slot
compared with using a single port on a dual port card in an 8X slot?
I believe the MemFree cards also incur a few tenths of a microsecond
latency hit because of the need to access host memory, at least
according to the preliminary benchmarks shown at SC'04.
>
> I hear that today's street price for Mellanox-based cards is ~$600 in
> cluster-sized quantities, which matches what you report.
We did a little better than that - for quantity 260, we paid < $450 for
PCI-E HCA's. A couple of other bids were around $500.
>
> > Like other high performance network gear, it's tough to get
> > accurate pricing information without going out and getting quotes.
>
> One nice thing about Myricom is that their prices have always been on
> the web -- all you need to know in addition is your discount.
>
> -- greg
Agreed. Among the many other nice things about Myricom.
Don
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From mwill at penguincomputing.com Fri Mar 4 18:52:18 2005
From: mwill at penguincomputing.com (Michael Will)
Date: Fri, 4 Mar 2005 15:52:18 -0800
Subject: [Beowulf] HA OSCAR for loadbalancing and failover
In-Reply-To: <006d01c52086$8c50a8d0$0f120897@PMORND>
References: <006d01c52086$8c50a8d0$0f120897@PMORND>
Message-ID: <200503041552.18491.mwill@penguincomputing.com>
Even though it is an interesting idea to use a beowulf cluster for this,
in particular when using several nodes to do load balancing and
automatic deployment of services, I think it is the wrong tool for the
task you have set for yourself.
Your requirements would probably be more easily fulfilled with a simple
HA failover cluster (no OSCAR involved). See http://www.ultramonkey.org for details.
Especially when you only have two servers, one as primary and one as standby,
which is a classical active/passive config, there is no reason to have the
complexity of a beowulf-style cluster.
ultramonkey.org also mentions LVS, which helps with load balancing, and I believe
they even have a solution for session synchronisation, which means that even
when a failover of the load balancer occurs, a tcp/ip session does not die but
gets redirected.
You will not need any PBS then, but rather a package called 'heartbeat' that
defines the services to be failed over in its own config files.
Michael
On Thursday 03 March 2005 10:51 pm, Rajiv wrote:
> Dear Sir,
> I am carrying out load balancing using OSCAR-3.0. We are also carrying out
> failover using the HA-OSCAR 1.0 beta release (High Availability OSCAR). We are
> required to achieve load balancing
> and failover for the following services:
> 1. HTTP.
> 2. FTP.
> 3. TELNET.
> 4. DHCP.
> 5. SQUID.
> Our setup is as follows:
> 1 Primary server , 1 client node (using OSCAR-3.0)
> 1 standby server (using HA OSCAR)
>
> We have succeeded in building the cluster but are
> having problems regarding load balancing. We are trying to achieve
> load balancing using the PBS Server (Portable Batch System) which comes
> built in with OSCAR-3.0. We are queuing the services as jobs and trying to
> distribute these jobs between the server and client node. But the problem
> we are facing is that we are not able to submit the job to the PBS server.
> Sir, firstly, we would like you to confirm whether we are on the right
> track for achieving load balancing. We would also like to know how you
> have achieved load balancing.
>
> Regards,
> Rajiv
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
--
Michael Will, Linux Sales Engineer
Tel: 415-954-2822 Toll Free: 888-PENGUIN
Fax: 415-954-2899 www.penguincomputing.com
Visit us at FOSE 2005!
Washington Convention Center, Washington, DC
April 5th-7th, 2005
Linux Pavilion, Booth 2225
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Sat Mar 5 03:00:01 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Sat, 5 Mar 2005 09:00:01 +0100
Subject: [Beowulf] HA OSCAR for loadbalancing and failover
In-Reply-To: <200503041552.18491.mwill@penguincomputing.com>
References: <006d01c52086$8c50a8d0$0f120897@PMORND>
<200503041552.18491.mwill@penguincomputing.com>
Message-ID: <20050305075952.GH13336@leitl.org>
On Fri, Mar 04, 2005 at 03:52:18PM -0800, Michael Will wrote:
> Especially when you only have two servers, one as primary and one as standby,
> which is a classical active/passive config, then there is no reason to have the
> complexity of a beowulf style cluster.
http://www.linux-ha.org/ (1.99?) supports up to 8 nodes and beyond, but it needs some
testing.
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gmpc at sanger.ac.uk Sat Mar 5 05:45:27 2005
From: gmpc at sanger.ac.uk (Guy Coates)
Date: Sat, 5 Mar 2005 10:45:27 +0000 (GMT)
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To:
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID:
> In the "modern" era of PXE, on "most networks" adding new machines
> with specific configurations could be done with a single command or GUI.
> We're not there yet. :-)
The only thing that ever came close was RLX's Control Tower management
software. It did "one click" management and provisioning of machines.
On the blade systems it could even do "zero click configuration", as you
could set policy like
"Any machines I put into slots 1-10 should automatically get configuration
Y put on them."
It was generic enough so that it could provision any operating system you
could think of, and if it didn't do something you wanted it to, it was
also easy to dig under the covers and hack the code.
The only down side was the price-tag, which was extortionate.
Guy
--
Dr. Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From kus at free.net Sat Mar 5 07:59:19 2005
From: kus at free.net (Mikhail Kuzminsky)
Date: Sat, 05 Mar 2005 15:59:19 +0300
Subject: [Beowulf] 2.6.11 is out; with InfiniBand support
In-Reply-To:
Message-ID:
In message from Mark Hahn (Wed, 2 Mar 2005
18:09:09 -0500 (EST)):
>> Arima and Iwill have mobos with IB LOM (Landed on Motherboard).
>
>given the choice between a $150 pcie IB nic and having it onboard,
>I'd choose the separate card. I know the IB salesdroids always
>say that getting onto the MB will change everything, but this
>doesn't make sense. IB is completely different from onboard gigabit,
>for instance, because there is no ubiquitous IB infrastructure
>ready, waiting to be exploited.
>
>the problem with "if you build it onboard, they will come" is also
>the marginal cost. onboard gigabit is nearly the same cost as
>onboard 100bT, very low, and you pretty much always want it.
onboard IB is noticeably higher than onboard GBE, noticeable in
>absolute terms, and you definitely have no possible use for it
>on many systems.
>
>remember, most people don't even saturate GBE yet,
Yes, I agree. But we are developing a quantum-chemical application
whose speedup under parallelization is bandwidth-limited. And we find
that the speedup on 6 processors with an IB 4x interconnect is about 34%
higher than with Myrinet.
Yours
Mikhail Kuzminsky
Zelinsky Institute of Organic Chemistry
Moscow
> and GBE
>ports are damned cheap. GBE nics are free, and switch ports
>are now down to $US 23/port:
>
>http://froogle.google.com/froogle?q=netgear+GS748T&btnG=Search+Froogle
>
>fundamentally, IB is still facing most of the same problems it always
>has:
>
>- requires fairly expensive, unique infrastructure
>- not the greatest physical layer: it's easy to wind up with
> literally tons of IB cables.
>- not clearly superior in performance vs alternatives.
>- apparently designed by people who disliked existing technique
> or were ignorant of it.
>- not a drop-in replacement for alternatives.
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
>http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Sun Mar 6 09:39:06 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Sun, 6 Mar 2005 09:39:06 -0500 (EST)
Subject: [Beowulf] http://www.beowulf.org/overview/history.html
In-Reply-To:
References:
Message-ID:
On Thu, 3 Mar 2005, Trent Jarvi wrote:
>
> Just a heads up.
>
> This page appears to be corrupted.
>
> While not all Beowulf clusters are supercomputers, one can build a Beowulf
> that is powerful enough to attract the interest of supercomputer users.
> Beyond the seasoned parallel programmer, Beowulf clusters have been built
> and used by programmers with little or no parallel programming experience.
> Beowulf clusters provide universities, often with limited resources, an
> excellent platform to teach parallel programming cNvq0ZhTgBrP
> kOLZWGuE0+ZiqlFOd2ml5US6LXQ/8jfnOSP4wydRdXTBOTOpewexZw1KyyFaZYgXTx5zQTNf
> 5QFWN4fE0H3CCkPYVhNTdPWIDurIhwMLdwxbCTM6fcG3+JA+1TpQX+s5ZlYw5+bvDqkre+1Y
>
> [...]
Corrupted and out of date, too:-)
Nobody who looks at the top500 list (whatever my opinions about its
basis;-) would nowadays say that one can "build a Beowulf that is
powerful enough to attract the interest of supercomputer users".
It's getting to be much more of a "seasoned parallel programmers (a.k.a.
`old guys') can remember a time when parallel programming was carried
out on `supercomputers', basically a name for a cluster with proprietary
internal processor interconnects".
Linux hasn't finished taking over the world, although it continues to
make excellent progress with all sorts of economic and historical forces
driving it. "Beowulfs" in the generic sense of COTS clusters with
network interconnects for IPCs, pretty much have taken over the
supercomputing world with only a few exceptions, and even those
exceptions are relying less and less on anything like a custom
communications bus. Not even the engineering of the dedicated systems
scales, while using a "COTS" communication platform such as Myri or
Dolphinics, or IB or even gigE lets you leverage all sorts of useful
work done by other humans devoted to this one purpose or this purpose
among others.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Sun Mar 6 10:04:17 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Sun, 6 Mar 2005 10:04:17 -0500 (EST)
Subject: [Beowulf] DCC (debian cluster components)
In-Reply-To:
References: <20050303163359.GJ4482@washoe.rutgers.edu>
<1109928798.5537.22.camel@Vigor45>
Message-ID:
On Sat, 5 Mar 2005, Guy Coates wrote:
> > In the "modern" era of PXE, on "most networks" adding new machines
> > with specific configurations could be done with a single command or GUI.
> > We're not there yet. :-)
>
> The only thing that ever came close was RLX's Control Tower management
> software. It did "one click" management and provisioning of machines.
> On the blade systems it could even do "zero click configuration", as you
> could set policy like
>
> "Any machines I put into slots 1-10 should automatically get configuration
> Y put on them."
>
> It was generic enough so that it could provision any operating system you
> could think of, and if it didn't do something you wanted it to, it was
> also easy to dig under the covers and hack the code.
>
> The only down side was the price-tag, which was extortionate.
>
> Guy
>
>
I agree that there is some work involved in building a PXE installable
configuration for e.g. kickstart, but it isn't excessive. Take template
kickstart file(s). Edit them to select package set(s) for the same (or
different) node config(s). Put them on the server. Edit dhcpd.conf to point
to the bootloader in /tftpboot. Edit boot.msg and pxelinux.cfg/default to
list the node type(s) and point to the associated kickstart file(s)
and boot option(s), respectively. Boot.
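As a rough illustration (the addresses, host names and file names here are
made up for the example, not taken from any real setup), the dhcpd.conf and
pxelinux.cfg pieces end up looking something like this:

# dhcpd.conf fragment
subnet 192.168.1.0 netmask 255.255.255.0 {
    range 192.168.1.100 192.168.1.200;
    next-server 192.168.1.1;        # the tftp server
    filename "pxelinux.0";          # bootloader under /tftpboot
}

# /tftpboot/pxelinux.cfg/default fragment
default fc3-node
label fc3-node
    kernel vmlinuz-fc3
    append initrd=initrd-fc3.img ks=http://192.168.1.1/kickstart/fc3-node.cfg

The ks= argument is what ties a PXE boot entry back to the matching
kickstart file.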
There IS a GUI for building the kickstart file (under RH and FC, at
least), although I suspect that most people would use this at most to
build the template and then tune it up by hand -- it is actually easier
to edit a flatfile with an editor once you see the layout.
dhcpd doesn't have a GUI to front it AFAIK, so this remains a place one
could do some work. It would be useful to build one that tests the
URL paths to e.g. the kickstart files and the tftpboot paths to the
initrd images. It would be even lovelier to have the same interface
edit boot.msg and pxelinux.cfg/default at the same time so that all
three could be consistent. This is the one place I find myself hopping
from directory to directory to make matching changes, as I create an
image for "fc3 workstation" or "fc3 node" for testing purposes and need
to ensure that the matching kickstart file is in the right place and
correctly corresponds to the dhcpd entry. This is even more true if one
wants to create an image indexed per IP number (the "right" way,
arguably, for tftpboot to function) so that everything becomes totally
automagic.
I always am of two minds about high-level front ends to low level admin
commands. Yes, they are convenient and let newbies start to work with a
low buy-in as far as learning curve is concerned. However, they also
SHIELD a newbie from learning what they really need to know to be an
effective manager, and (to my own experience anyway) they rarely work
stably as the various tools upon which they are built evolve. For one
thing, the GUI designer almost always omits features and capabilities of
the lower level stuff, so eventually you want to do something you "know"
can be done but the GUI doesn't. For another, somebody changes
something really subtle, such as a default pathway in /tftpboot, and
your GUI "breaks" and you have no idea why and won't until you learn,
in a panic, all the things the GUI shielded you from.
I tend to think GUIs work best when they are a standard part of a single
administrative package co-developed by the package's maintainers. A GUI
that spans multiple tools and functions simply has more issues. Hence
redhat-config-kickstart has a chance to remain useful, but building its
functionality into a sweeping redhat-config-pxe (including kickstart,
dhcpd, tftp) is a bit dicier.
STILL POSSIBLE, mind you -- it just needs somebody to love it and
maintain it. That's why I asked about FAI -- if somebody doesn't
actively maintain the GUIs, the scripts, the documentation, as the
underlying toolset slowly changes, eventually the friendly front end
fails to encompass all sorts of desirable functionality or starts to
break, sometimes, for some users, trying to do some things. In FC or
RH, there is no "supertool", but the individual tools work well and
aren't THAT hard to learn -- most of them have dedicated HOWTOs. Some
supertools exist in ROCKS and warewulf and so on, and have (at the
moment) devoted maintainers who love their product.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Mon Mar 7 13:01:20 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 07 Mar 2005 13:01:20 -0500
Subject: [Beowulf] looking for a reference on failure rates
Message-ID: <422C96F0.4090406@scalableinformatics.com>
Hi folks:
I am looking for a reference which describes failure rates of modern
computer components as a function of temperature. The usual rule of
thumb is that every 10 degrees above a certain value doubles the failure
rate (or decreases lifetime). I would like to look at this analysis and
refer to it for something I am working on.
Thanks
Joe
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From James.P.Lux at jpl.nasa.gov Mon Mar 7 15:00:25 2005
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 07 Mar 2005 12:00:25 -0800
Subject: [Beowulf] looking for a reference on failure rates
In-Reply-To: <422C96F0.4090406@scalableinformatics.com>
References: <422C96F0.4090406@scalableinformatics.com>
Message-ID: <6.1.1.1.2.20050307115932.041dcf70@mail.jpl.nasa.gov>
At 10:01 AM 3/7/2005, Joe Landman wrote:
>Hi folks:
>
> I am looking for a reference which describes failure rates of modern
> computer components as a function of temperature. The usual rule of
> thumb is that every 10 degrees above a certain value doubles the failure
> rate (or decreases lifetime). I would like to look at this analysis and
> refer to it for something I am working on.
>
> Thanks
>
>Joe
How rigorous a reference? Or a general description of failure rates vs
temp, for, e.g. microprocessors.
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Mon Mar 7 15:03:46 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 07 Mar 2005 15:03:46 -0500
Subject: [Beowulf] looking for a reference on failure rates
In-Reply-To: <6.1.1.1.2.20050307115932.041dcf70@mail.jpl.nasa.gov>
References: <422C96F0.4090406@scalableinformatics.com>
<6.1.1.1.2.20050307115932.041dcf70@mail.jpl.nasa.gov>
Message-ID: <422CB3A2.1020307@scalableinformatics.com>
Hi Jim:
Something I can refer to for primary literature for a paper. If it
is anecdotal, that may be fine as well, though I will have to treat it
differently.
This is largely for microprocessors, disks, networks, etc. General
digital equipment, with a focus on computers in clusters.
Thanks!
Joe
Jim Lux wrote:
> At 10:01 AM 3/7/2005, Joe Landman wrote:
>
>> Hi folks:
>>
>> I am looking for a reference which describes failure rates of modern
>> computer components as a function of temperature. The usual rule of
>> thumb is that every 10 degrees above a certain value doubles the
>> failure rate (or decreases lifetime). I would like to look at this
>> analysis and refer to it for something I am working on.
>>
>> Thanks
>>
>> Joe
>
> How rigorous a reference? Or a general description of failure rates vs
> temp, for, e.g. microprocessors.
>
>
>
> James Lux, P.E.
> Spacecraft Radio Frequency Subsystems Group
> Flight Communications Systems Section
> Jet Propulsion Laboratory, Mail Stop 161-213
> 4800 Oak Grove Drive
> Pasadena CA 91109
> tel: (818)354-2075
> fax: (818)393-6875
>
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From djholm at fnal.gov Mon Mar 7 14:59:56 2005
From: djholm at fnal.gov (Don Holmgren)
Date: Mon, 07 Mar 2005 13:59:56 -0600
Subject: [Beowulf] looking for a reference on failure rates
In-Reply-To: <422C96F0.4090406@scalableinformatics.com>
References: <422C96F0.4090406@scalableinformatics.com>
Message-ID:
On Mon, 7 Mar 2005, Joe Landman wrote:
> Hi folks:
>
> I am looking for a reference which describes failure rates of modern
> computer components as a function of temperature. The usual rule of
> thumb is that every 10 degrees above a certain value doubles the failure
> rate (or decreases lifetime). I would like to look at this analysis and
> refer to it for something I am working on.
>
> Thanks
>
> Joe
>
>
Joe -
Take a look at this Test and Measurement World article for starters:
http://www.reed-electronics.com/tmworld/article/CA187523.html
The rule of thumb that you mention comes from using an Arrhenius model
to describe the relationship between temperature and failure rates.
Arrhenius first published this equation (now named after him) in 1889
k(T) = A exp ( -Ea / RT)
to explain the variation of reaction rates with temperature of several
elementary chemical reactions. Here, k is the reaction rate, A is a
constant, Ea is the activation energy for the reaction, R is the ideal
gas constant, and T is the temperature in Kelvin. It turns out that
many semiconductor degradation mechanisms - electromigration, corrosion,
defect growth, etc. - fit this relationship well. Note that you'll
usually see Boltzmann's constant (another 'k') instead of 'R' in the
semiconductor reliability literature. Chemists use R and express Ea in
units of kJoule/mole, physicists and engineers tend to use k and express
Ea in electron volts.
In the reliability literature, you'll often see the Arrhenius model
written in terms of time to failure, which is proportional to the inverse
of the reaction rate. At two different temperatures T1 and T2, the
times to failure would be given by
t1 = A exp (Ea / kT1) # k = Boltzmann's constant
t2 = A exp (Ea / kT2)
and so the ratio of lifetimes is given by
t1/t2 = exp [ Ea/k * (1/T1 - 1/T2) ]
If T1 is room temperature (~298 K), an activation energy of about
0.54 eV would give a doubling in failure rate at a 10 degree C higher
temperature.
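As a quick numerical check of that figure (assuming only Ea = 0.54 eV and
the two temperatures), a few lines of C:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double k  = 8.617e-5;   /* Boltzmann's constant in eV/K */
    double Ea = 0.54;       /* assumed activation energy, eV */
    double T1 = 298.0;      /* room temperature, K           */
    double T2 = 308.0;      /* 10 degrees C hotter, K        */

    /* t1/t2 = exp[ Ea/k * (1/T1 - 1/T2) ] */
    double ratio = exp((Ea / k) * (1.0 / T1 - 1.0 / T2));
    printf("lifetime ratio t1/t2 = %.2f\n", ratio);
    return 0;
}

which prints a ratio of about 1.98, i.e. the familiar factor of two.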
There's a handy chemist's page at
http://antoine.frostburg.edu/chem/senese/101/kinetics/faq/temperature-and-reaction-rate.shtml
that will let you plug in 3 of the 4 variables (T1, T2, Ea, reaction
rate ratio) and it will give you the fourth.
I've got a number of semiconductor reliability texts with tables of Ea
versus failure mechanism - I can post the references if you request,
though they're a bit dated (15 years old). Ea varies widely in these
tables from about 0.3 eV to as high as 2.0 eV. There are even some
negative Ea's, corresponding to failure mechanisms that decelerate with
increasing temperature. The "factor of 2 with every 10 degrees" is only
a very rough rule of thumb.
Don Holmgren
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Mon Mar 7 15:31:53 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 07 Mar 2005 15:31:53 -0500
Subject: [Beowulf] looking for a reference on failure rates
In-Reply-To:
References: <422C96F0.4090406@scalableinformatics.com>
Message-ID: <422CBA39.60600@scalableinformatics.com>
Hi Don:
This is excellent. More detail than I need for this, but useful
nonetheless. I am familiar with/have used Arrhenius models for rate
prediction in the past, but did not make the connection to failure rates.
The next question this raises is the failure mode. Each failure
mode will likely have a different set of activation energies.
I'll go grab this info (have some old stuff around). Any similarity
for disk drives and other components? Of course the question of what
the activation energy would mean for a macroscopic failure (hard disks)
is relevant ...
Thanks!
Joe
Don Holmgren wrote:
>
> On Mon, 7 Mar 2005, Joe Landman wrote:
>
>
>>Hi folks:
>>
>> I am looking for a reference which describes failure rates of modern
>>computer components as a function of temperature. The usual rule of
>>thumb is that every 10 degrees above a certain value doubles the failure
>>rate (or decreases lifetime). I would like to look at this analysis and
>>refer to it for something I am working on.
>>
>> Thanks
>>
>>Joe
>>
>>
>
>
> Joe -
>
> Take a look at this Test and Measurement World article for starters:
>
> http://www.reed-electronics.com/tmworld/article/CA187523.html
>
> The rule of thumb that you mention comes from using an Arrhenius model
> to describe the relationship between temperature and failure rates.
> Arrhenius first published this equation (now named after him) in 1889
>
> k(T) = A exp ( -Ea / RT)
>
> to explain the variation of reaction rates with temperature of several
> elementary chemical reactions. Here, k is the reaction rate, A is a
> constant, Ea is the activation energy for the reaction, R is the ideal
> gas constant, and T is the temperature in Kelvin. It turns out that
> many semiconductor degradation mechanisms - electromigration, corrosion,
> defect growth, etc. - fit this relationship well. Note that you'll
> usually see Boltzmann's constant (another 'k') instead of 'R' in the
> semiconductor reliability literature. Chemists use R and express Ea in
> units of kJoule/mole, physicists and engineers tend to use k and express
> Ea in electron volts.
>
> In the reliability literature, you'll often see the Arrhenius model
> written in terms of time to failure, which is proportional to the inverse
> of the reaction rate. At two different temperatures T1 and T2, the
> times to failure would be given by
>
> t1 = A exp (Ea / kT1) # k = Boltzmann's constant
> t2 = A exp (Ea / kT2)
>
> and so the ratio of lifetimes is given by
>
> t1/t2 = exp [ Ea/k * (1/T1 - 1/T2) ]
>
> If T1 is room temperature (~298 K), an activation energy of about
> 0.54 eV would give a doubling in failure rate at a 10 degree C higher
> temperature.
>
> There's a handy chemist's page at
>
> http://antoine.frostburg.edu/chem/senese/101/kinetics/faq/temperature-and-reaction-rate.shtml
>
> that will let you plug in 3 of the 4 variables (T1, T2, Ea, reaction
> rate ratio) and it will give you the fourth.
>
> I've got a number of semiconductor reliability texts with tables of Ea
> versus failure mechanism - I can post the references if you request,
> though they're a bit dated (15 years old). Ea varies widely in these
> tables from about 0.3 eV to as high as 2.0 eV. There are even some
> negative Ea's, corresponding to failure mechanisms that decelerate with
> increasing temperature. The "factor of 2 with every 10 degrees" is only
> a very rough rule of thumb.
>
> Don Holmgren
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From JSaxelby at vicr.com Mon Mar 7 15:24:04 2005
From: JSaxelby at vicr.com (Saxelby, John)
Date: Mon, 7 Mar 2005 15:24:04 -0500
Subject: [Beowulf] The Arrhenius Equation and life
Message-ID:
I am going out on a limb here, but from my work I have concluded that the
"10 degree temperature rise halves the life" rule is bogus. It is based on
the Arrhenius equation, which is for chemical reactions. It does not apply
to semiconductors, for example, though it might apply to the oil in the
bearings of a hard disk. This is a controversial subject, and the answer is
not as simple as 10 C = 2X.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From James.P.Lux at jpl.nasa.gov Mon Mar 7 18:26:17 2005
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 07 Mar 2005 15:26:17 -0800
Subject: [Beowulf] looking for a reference on failure rates
In-Reply-To: <422CB3A2.1020307@scalableinformatics.com>
References: <422C96F0.4090406@scalableinformatics.com>
<6.1.1.1.2.20050307115932.041dcf70@mail.jpl.nasa.gov>
<422CB3A2.1020307@scalableinformatics.com>
Message-ID: <6.1.1.1.2.20050307150113.042f5398@mail.jpl.nasa.gov>
At 12:03 PM 3/7/2005, Joe Landman wrote:
>Hi Jim:
>
> Something I can refer to for primary literature for a paper. If it is
> anecdotal, that may be fine as well, though I will have to treat it
> differently.
>
> This is largely for microprocessors, disks, networks, etc. General
> digital equipment, with a focus on computers in clusters.
>
> Thanks!
>
>J
One of our reliability guys recommended this:
E.A. Amerasekera & F.N. Najim, "Failure Mechanisms in Semiconductor
Devices", 2nd Ed., Wiley, NY, 1997
You might also take a look at MIL-HDBK-217F, which provides reliability
math models for just about everything electronic. There might be some
argument about the applicability of this in some instances, but it's
certainly a commonly used document. Chapter 5 talks about
microcircuits. Section 5.8 has the temperature factors (Ea in eV) for
various logic families... CMOS looks like 0.35, BiCMOS and LSTTL are 0.5,
Linears are 0.65
(Converting to life effects, it looks like a 20C rise in temp corresponds
to twice the failure rate for CMOS, and a 20C rise is a 4.9 factor increase
for Linears (10 deg= 2.3))... the actual assembly failure rate will depend
on how many of each kind of part, what temperature they're at, etc.
Something like a doubling per 10C is probably not too far from the overall
effect. It's not going to be 10 times and it's not going to be 10%
increase either.
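For what it's worth, the same Arrhenius ratio quoted earlier in the thread
roughly reproduces those factors. A small C sketch (the Ea values are the
handbook numbers above; the starting temperature is an assumption, just to
make the arithmetic concrete):

#include <math.h>
#include <stdio.h>

int main(void)
{
    const double k = 8.617e-5;                    /* Boltzmann's constant, eV/K */
    const double Ea[]   = { 0.35, 0.50, 0.65 };   /* CMOS, BiCMOS/LSTTL, linear */
    const char  *name[] = { "CMOS", "BiCMOS/LSTTL", "linear" };
    const double T1 = 298.0, T2 = 318.0;          /* an assumed 20 C rise       */
    int i;

    for (i = 0; i < 3; i++) {
        double factor = exp((Ea[i] / k) * (1.0 / T1 - 1.0 / T2));
        printf("%-14s Ea=%.2f eV -> failure rate x%.1f for +20 C\n",
               name[i], Ea[i], factor);
    }
    return 0;
}

which prints roughly 2.4x, 3.4x and 4.9x; the exact numbers shift a bit with
the starting temperature, which is part of why these are only rules of thumb.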
There's also a BellCore/Telcordia model which apparently takes into account
burnin and testing. It might be more relevant, depending on the environment.
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jake at spiekerfamily.com Tue Mar 8 19:49:14 2005
From: jake at spiekerfamily.com (Jake Thebault-Spieker)
Date: Tue, 08 Mar 2005 19:49:14 -0500
Subject: [Beowulf] DCC information?
Message-ID: <422E480A.4070007@spiekerfamily.com>
hi all,
I've noticed that within the past week there have been some mentions of
DCC. Does anyone know of a HOWTO, or a writeup somewhere from which I can
learn how to use it? I tried installing it, and didn't understand half
of it. But it is now installed on what will be my master node. Thoughts?
--
I think computer viruses should count as life.
I think it says something about human nature
that the only form of life we have created so far is purely destructive.
We've created life in our own image.
--Stephen Hawking
010010100110000101101
011011001010010000001
010100011010000110010
101100010011000010111
010101101100011101000
010110101010011011100
000110100101100101011
010110110010101110010
/www.plinko.net\\>
Jake Thebault-Spieker
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From sadat.ali.khan at gmail.com Wed Mar 9 10:36:33 2005
From: sadat.ali.khan at gmail.com (sadat khan)
Date: Wed, 9 Mar 2005 21:06:33 +0530
Subject: [Beowulf] queries
Message-ID:
I would like to know from the esteemed members what MPI and PVM really
are, and whether they have become outdated.
Another question: what is rock?
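For concreteness, MPI (the Message Passing Interface) is a library standard
for writing parallel programs whose processes communicate by passing
messages across the nodes of a cluster; PVM (Parallel Virtual Machine) is an
older library with a similar purpose. A minimal MPI program in C (assuming
some MPI implementation such as MPICH or LAM/MPI is installed) looks like
this:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime         */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?           */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in total?  */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with something like mpirun -np 4 ./hello,
it prints one line from each of the four processes.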
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From tim at linux-force.com Wed Mar 9 11:46:23 2005
From: tim at linux-force.com (tim at linux-force.com)
Date: Wed, 9 Mar 2005 11:46:23 -0500 (EST)
Subject: [Beowulf] StorCloud call for participation extended
Message-ID:
All,
The deadline has been extended to participate in the StorCloud challenge.
Please visit the URL below, but ignore the March 1st deadline.
http://www.vtksolutions.com/StorCloud/2005/applications.html
Regards,
Tim Wilcox
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at streamline-computing.com Wed Mar 9 16:10:46 2005
From: john.hearns at streamline-computing.com (John Hearns)
Date: Wed, 09 Mar 2005 21:10:46 +0000
Subject: [Beowulf] queries
In-Reply-To:
References:
Message-ID: <1110402646.5643.127.camel@Vigor11>
On Wed, 2005-03-09 at 21:06 +0530, sadat khan wrote:
> I would like to know from the esteemed members as to what really are
> MPI and PVM and have become outdated???
> Another question is what is rock ?
Sadat,
Rocks is a clustering distribution.
http://rocks.npaci.edu/Rocks/
It is a distribution of Linux which makes it easy to construct a
Beowulf.
There is also Rock Linux, which I was interested in at one point.
But I don't think this is what you are interested in.
http://www.rocklinux.org/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jonas.palencia at abbott.com Wed Mar 9 17:31:28 2005
From: jonas.palencia at abbott.com (Jonas M Palencia)
Date: Wed, 9 Mar 2005 17:31:28 -0500
Subject: [Beowulf] scyld beowulf beoboot-install utility
Message-ID:
Hi All,
We are running Scyld 28cz-7 on our cluster.
One of our nodes (Compute Node 0) in the cluster was replaced because of
a bad motherboard, so the MAC address has changed. The hard disk wasn't
changed, but some Linux was installed on it for testing purposes. I'm
trying to add this node back to the cluster. Using beosetup, the new MAC
address was registered as node 0.
I tried to partition the disk using the beofdisk tool, then I restarted
the node:
----------------------------------
[root at abcmc02 fdisk]# beofdisk -w -n 0
Disk /dev/hda: 4865 cylinders, 255 heads, 63 sectors/track
Old situation:
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
Device Boot Start End #cyls #blocks Id System
/dev/hda1 * 0+ 0 1- 8001 89 Unknown
/dev/hda2 1 516 516 4144770 82 Linux swap
/dev/hda3 517 4864 4348 34925310 83 Linux
/dev/hda4 0 - 0 0 0 Empty
New situation:
Units = sectors of 512 bytes, counting from 0
Device Boot Start End #sectors Id System
/dev/hda1 * 63 16064 16002 89 Unknown
/dev/hda2 16065 8305604 8289540 82 Linux swap
/dev/hda3 8305605 78156224 69850620 83 Linux
/dev/hda4 0 - 0 0 Empty
Successfully wrote the new partition table
Re-reading the partition table ...
If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
The partition table on node 0 has been modified.
You must reboot each affected node for changes to take effect.
[root at abcmc02 fdisk]# beoboot-install 0 /dev/hda
Creating boot images...
Installing beoboot on partition 1 of /dev/hda.
mke2fs 1.32 (09-Nov-2002)
/dev/hda1: 11/2000 files (0.0% non-contiguous), 268/8001 blocks
Done
rcp: /boot/boot.b: No such file or directory
Failed to copy boot.b to node 0:/tmp/.beoboot-install.mnt
--------------------------------
I guess the problem is the beoboot-install utility. It didn't find the
/boot/boot.b file. Indeed, that file cannot be found on the master node.
Could this be a bug?
After rebooting, it came out with an ERROR state on the BeoSetup window.
Here's the log:
----------------
node_up: Initializing cluster node 0 at Wed Mar 9 15:44:55 EST 2005.
node_up: Setting system clock from the master.
node_up: Configuring loopback interface.
node_up: Loading device support modules for kernel version
2.4.27-294r0048.Scyldsmp.
setup_fs: Configuring node filesystems using /etc/beowulf/fstab...
setup_fs: Checking /dev/hda2 (type=swap)...
chkswap: /dev/hda2: Unable to find swap-space signature
setup_fs: FSCK failure. (OK for RAM disks)
setup_fs: Mounting /dev/hda2 on swap (type=swap; options=defaults)
swapon: /dev/hda2: Invalid argument
setup_fs: Failed to mount /dev/hda2 on swap (fatal).
---------------
thanks,
Jonas
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ocschwar at MIT.EDU Thu Mar 10 00:07:01 2005
From: ocschwar at MIT.EDU (Omri Schwarz)
Date: Thu, 10 Mar 2005 00:07:01 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
Message-ID: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
The Ageia.com physics processing unit's marketing literature seems
so oriented to gaming that I wonder if they would open their API
enough that people in our market niche could look at it and see whether
it is suitable for putting in clusters. But if they did, what do y'all
think? Would specialty (albeit commodity) coprocessors hanging off a
PCI slot be suitable for your applications?
http://ageia.com
While I'm bringing this up, how about things like the MAP
processor?
http://www.srccomp.com/HardwareElements.htm#MAPProcessor
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From wangsj at yahoo.com Thu Mar 10 09:36:29 2005
From: wangsj at yahoo.com (Shih-Jon Wang)
Date: Thu, 10 Mar 2005 06:36:29 -0800 (PST)
Subject: [Beowulf] can't download beoqueue
Message-ID: <20050310143629.11741.qmail@web30207.mail.mud.yahoo.com>
Hi Thomas. I can't download beoqueue... from the
following link. Could you please send me a copy of
it? Many Thanks!
SJ
http://www.weswulf.org/beoqueue.tar.gz
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Thu Mar 10 10:23:17 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 10 Mar 2005 10:23:17 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
Message-ID: <42306665.8020305@scalableinformatics.com>
Omri Schwarz wrote:
> The Ageia.com physics processing unit's marketing literature seems
> so oriented to gaming that I wonder if they would open their API
> enough that people in our market niche could look at it and see whether
> it is suitable for putting in clusters. But if they did, what do y'all
> think? Would specialty (albeit commodity) coprocessors hanging off a
> PCI slot be suitable for your applications?
Some applications would do very well with application specific
processing systems.
>
> http://ageia.com
>
>
> While I'm bringing this up, how about things like the MAP
> processor?
>
> http://www.srccomp.com/HardwareElements.htm#MAPProcessor
Or any others.
Inverting the question, if you pay 4000$US per dual CPU compute node
(+/- a bit depending upon technology, config, supplier), what price (if
any) would you be willing to pay for an accelerator that offered you an
order of magnitude more performance per node, on your code, and sat in
the PCI-e/X or HTX slots? And also as important: how hard would you be
willing to work/how much effort committed to program these things? This
makes lots of assumptions, such as such a beast existing, your code
being mapped or mappable to it, and you being interested in this.
Part of what motivates this question are things like the Cray XD1 FPGA
board, or PathScale's processors (unless I misunderstood their
functions). Other folks have CPUs on a card of various sorts, ranging
from FPGA to DSPs. I am basically wondering aloud what sort of demand
for such technology might exist. I assume the answer starts with "if
the price is right" ... the question is what is that price, what are
the features/functionality, and how hard do people want to work on such
bits.
Note: As Jeff Layton pointed out many times, the GPUs in a number of
machines are being used by at least one group for CFD, so you can think
of these as a sort of dedicated attached processor. They are not
general purpose, but highly specialized computational pipelines. If you
could have a more general one, what would it look like, what would it
do/emphasize, and how much would it cost? I know there is no one
answer, but I thought it would be fun to extend Omri's question.
Curious.
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 10 12:19:11 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 10 Mar 2005 12:19:11 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <42306665.8020305@scalableinformatics.com>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005, Joe Landman wrote:
> Inverting the question, if you pay 4000$US per dual CPU compute node
> (+/- a bit depending upon technology, config, supplier), what price (if
> any) would you be willing to pay for an accelerator that offered you an
> order of magnitude more performance per node, on your code, and sat in
> the PCI-e/X or HTX slots? And also as important: how hard would you be
> willing to work/how much effort committed to program these things? This
> makes lots of assumptions, such as such a beast existing, your code
> being mapped or mappable to it, and you being interested in this.
>
> Part of what motivates this question are things like the Cray XD1 FPGA
> board, or PathScale's processors (unless I misunderstood their
> functions). Other folks have CPUs on a card of various sorts, ranging
> from FPGA to DSPs. I am basically wondering aloud what sort of demand
> for such technology might exist. I assume the answer starts with "if
> the price is right" ... the question is what is that price, what are
> the features/functionality, and how hard do people want to work on such
> bits.
>
> Note: As Jeff Layton pointed out many times, the GPUs in a number of
> machines are being used by at least one group for CFD, so you can think
> of these as a sort of dedicated attached processor. They are not
> general purpose, but highly specialized computational pipelines. If you
> could have a more general one, what would it look like, what would it
> do/emphasize, and how much would it cost? I know there is no one
> answer, but I thought it would be fun to extend Omri's question.
Problems with coprocessing solutions include:
a) Cost -- sometimes they are expensive, although they >>can<< yield
commensurate benefits for some code as you point out.
b) Availability -- I don't just mean whether or not vendors can get
them; I mean COTS vs non-COTS. They are frequently one-of-a-kind beasts
with a single manufacturer.
c) Usability. They typically require "special tools" to use them at
all. Cross-compilers, special libraries, code instrumentation. All of
these things require fairly major programming effort to implement in
your code to realize the speedup, and tend to decrease the
general-purpose portability of the result, tying you even more tightly
(after investing all this effort) with the (probably one) manufacturer
of the add-on.
d) Continued Availability -- They also not infrequently disappear
without a trace (as "general purpose" coprocessors, not necessarily as
ASICs) within a year or so of being released and marketed. This is
because Moore's Law is brutal, and even if a co-processor DOES manage to
speed up your actual application (and not just a core loop that
comprises 70% of your actual application) by a factor of ten, that's at
most four or five years of ML advances. If your code has a base of 30%
or so that isn't sped up at all (fairly likely) then your application
runs maybe 2-3 times as fast at best and ML eats it in 1-3 years.
e) Support. Using the tools and processors effectively requires a
fair bit of knowledge, but there is usually a pitifully small set of
other implementers of the non-mainstream technology and no good
communications channels between them (with some exceptions, of course).
You're likely to be mostly on your own while trying to get the tools
installed, code written and debugged, and eventually made efficient. If
the tool or processor turns out to be "broken" for your purpose, you
aren't likely to get much help with this, either, as you're a fringe
market (again, with possible exceptions).
Each of these alters the naive cost-benefit estimate of "Gee, it is 10x
faster in my core loop and only makes my system cost 2x as much".
Maybe it is 10x faster in the core loop that is 70% of your code, so
that now the application runs in 0.37x the original time (good, but now
has to be compared to perhaps 0.5x the time available from getting 2x as
many ordinary systems). Maybe it takes you four months to get the
cross-compiler installed and all your code ported and to then TWEAK the
code so it really DOES give you the touted 10x speedup for your core
loops, which may have to be reblocked and written using special
instructions, which then also necessitates revalidating the results (in
case bugs have crept in during the port). Maybe the company that made
the core DSP releases a new one in the meantime (they've got ML to
contend with as well) and it has a different instruction set, so that a
year from now when you want to expand the cluster you either
re-instrument all the code again or rely on warehoused chips of the old
variety. Maybe in 1 year dual core, 64 bit CPUs are released that
effectively double, then double again, what you can get out of COTS
systems at near constant cost and your 32 bit CPU plus coprocessor
suddenly is slower, less portable, AND more expensive.
Or not. Maybe it speeds things up 10x, costs only 2x, will be available
for at least 3 more years, has a user base with hundreds of users and a
dedicated mailing list, has commercial or open source compiler support
that requires only minor tweaks or the use of standard library calls to
get most of the benefit, and is built to a standard so that four
companies make the actual chips, not just one.
I'm just reviewing the questions one would like to ask.
Anecdotally I'm reminded of e.g. the 8087, Microway's old transputer
sets (advertised in PC mag for decades), the i860 (IIRC), the CM-5, and
many other systems built over the years that tried to provide e.g. a
vector co-processor in parallel with a regular general purpose CPU,
sometimes on the same motherboard and bus, sometimes on daughterboards
or even on little mini-network connections hung off the bus somehow.
None of these really caught on (except for the 8087, and it is an
exercise for the studio audience as to why an add-on processor that
really should have been a part of the original processor itself, made by
the mfr of the actual crippled CPU from the beginning, succeeded),
although nearly all of them were used by at least a few intrepid
individuals to great benefit. Allowing that Nature is efficient in its
process of natural selection, this seems like a genetic/memetic
variation that generally lacks the CBA advantages required to make it a
real success.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Thu Mar 10 12:57:29 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 10 Mar 2005 12:57:29 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To:
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
Message-ID: <42308A89.5060907@scalableinformatics.com>
Robert G. Brown wrote:
> On Thu, 10 Mar 2005, Joe Landman wrote:
[...]
> Problems with coprocessing solutions include:
>
> a) Cost -- sometimes they are expensive, although they >>can<< yield
> commensurate benefits for some code as you point out.
I am imagining a co-processor in the 1/10 x -> 4x range compared to node
cost: a graphics card is the prototype I have in mind.
>
> b) Availability -- I don't just mean whether or not vendors can get
> them; I mean COTS vs non-COTS. They are frequently one-of-a-kind beasts
> with a single manufacturer.
This is an issue with almost anything except CPUs, where we have 2 (or 3
if you include PPC) manufacturers.
> c) Usability. They typically require "special tools" to use them at
> all. Cross-compilers, special libraries, code instrumentation.
TANSTAAFL. The idea is that the cost to do the "port" has to be low (close
to zero).
> All of
> these things require fairly major programming effort to implement in
> your code to realize the speedup, and tend to decrease the
> general-purpose portability of the result, tying you even more tightly
> (after investing all this effort) with the (probably one) manufacturer
> of the add-on.
>
> d) Continued Availability -- They also not infrequently disappear
> without a trace (as "general purpose" coprocessors, not necessarily as
> ASICs) within a year or so of being released and marketed. This is
> because Moore's Law is brutal, and even if a co-processor DOES manage to
> speed up your actual application (and not just a core loop that
> comprises 70% of your actual application) by a factor of ten, that's at
> most four or five years of ML advances. If your code has a base of 30%
> or so that isn't sped up at all (fairly likely) then your application
> runs maybe 2-3 times as fast at best and ML eats it in 1-3 years.
And this is why the pricing issue is important. At what point does it
make economic sense to buy a coprocessor? In the case of graphics
cards, the coprocessor has amazing economies of scale (and it needs them).
You need similar economies of scale for a coprocessor system, which is
why I think the cost should be similar to the node cost (like existing
graphics cards at 1/10 to 4x node cost).
>
> e) Support. Using the tools and processors effectively requires a
> fair bit of knowledge, but there is usually a pitifully small set of
> other implementers of the non-mainstream technology and no good
> communications channels between them (with some exceptions, of course).
Hmmm. OpenGL uses C/C++/Fortran bindings to get at the power (at least
I think there is a way to call GL from Fortran). What I was thinking
was a high level (C/Fortran/C++) interface to them a la OpenGL. Jeff
Layton, if you are around, what is the name of that compiler set for the
GPUs? Brook? Something like that.
> You're likely to be mostly on your own while trying to get the tools
> installed, code written and debugged, and eventually made efficient. If
> the tool or processor turns out to be "broken" for your purpose, you
> aren't likely to get much help with this, either, as you're a fringe
> market (again, with possible exceptions).
>
> Each of these alter the naive cost-benefit estimate of "Gee it is 10x
> faster in more core loop and only makes my system cost 2x as much".
>
> Maybe it is 10x faster in the core loop that is 70% of your code, so
> that now the application runs in 0.37x the original time (good, but now
> has to be compared to perhaps 0.5x the time available from getting 2x as
> many ordinary systems). Maybe it takes you four months to get the
> cross-compiler installed and all your code ported and to then TWEAK the
> code so it really DOES give you the touted 10x speedup for your core
> loops, which may have to be reblocked and written using special
> instructions, which then also necessitates revalidating the results (in
> case bugs have crept in during the port). Maybe the company that made
> the core DSP releases a new one in the meantime (they've got ML to
> contend with as well) and it has a different instruction set, so that a
> year from now when you want to expand the cluster you either
> re-instrument all the code again or rely on warehoused chips of the old
> variety.
Again, I point to OpenGL as a prototypical interface for this. The
underlying driver may change, but the interface is effectively constant
to the programmer, regardless of how many pixel shaders exist in the
pipeline.
> Maybe in 1 year dual core, 64 bit CPUs are released that
> effectively double, then double again, what you can get out of COTS
> systems at near constant cost and your 32 bit CPU plus coprocessor
> suddenly is slower, less portable, AND more expensive.
Well, this has happened in the GPU market, and the GPUs have tracked
with ML. This is an issue for anyone committing to any computer of any
sort. ML is ML and it is going to drop the value of what you purchase
very quickly.
>
> Or not. Maybe it speeds things up 10x, costs only 2x, will be available
> for at least 3 more years, has a user base with hundreds of users and a
> dedicated mailing list, has commercial or open source compiler support
> that requires only minor tweaks or the use of standard library calls to
> get most of the benefit, and is built to a standard so that four
> companies make the actual chips, not just one.
And an interface layer that masks the chip differences, so that when
chips are changed out, the programs need not change (like OpenGL),
though they can change to take advantage of great new feature X (an
additional MAC layer in the pipeline).
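As a concrete (and purely hypothetical) illustration of what such a masking
interface looks like from the application side, here is a minimal C sketch;
none of these names belong to any real product, and the CPU fallback is only
there so the sketch runs. The point is that accel_saxpy() stays fixed while
whatever sits behind it changes:

/* Hypothetical "interface layer": y = a*x + y on n elements.  A real
 * driver would ship the arrays off to the card here; the CPU fallback
 * below just makes the sketch self-contained and runnable. */
#include <stdio.h>
#include <stdlib.h>

void accel_saxpy(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    int n = 1 << 20;
    float *x = malloc(n * sizeof *x), *y = malloc(n * sizeof *y);
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
    accel_saxpy(n, 3.0f, x, y);
    printf("y[0] = %f (expect 5.0)\n", y[0]);
    free(x); free(y);
    return 0;
}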
>
> I'm just reviewing the questions one would like to ask.
>
> Anecdotally I'm reminded of e.g. the 8087, Micro Way's old transputer
> sets (advertised in PC mag for decades), the i860 (IIRC), the CM-5, and
> many other systems built over the years that tried to provide e.g. a
> vector co-processor in parallel with a regular general purpose CPU,
> sometimes on the same motherboard and bus, sometimes on daughterboards
> or even on little mini-network connections hung off the bus somehow.
>
> None of these really caught on (except for the 8087, and it is an
> exercise for the studio audience as to why an add-on processor that
> really should have been a part of the original processor itself, made by
> the mfr of the actual crippled CPU from the beginning, succeeded),
> although nearly all of them were used by at least a few intrepid
> individuals to great benefit. Allowing that Nature is efficient in its
> process of natural selection, this seems like a genetic/memetic
> variation that generally lacks the CBA advantages required to make it a
> real success.
So there is an expression that I like attributing to myself, but I may
have "borrowed" it from elsewhere.
Something designed to fail often will.
The "general purpose" accelerator cards (transputer, NS32032, ...) all
suffered from a lack of application focus, among other things. There was
the prevalent attitude of "if you build it, then they will buy". These
units largely failed to take hold apart from tiny niches.
OTOH, "specialized" accelerator cards (Graphics cards, RAID cards, Sound
cards) have been a smashing success, as the CBA makes sense, they
deliver a specific value, and they are easy to use. The take home
message is that any accelerator card needs to do the same. What these
accelerator cards do is offload work from the CPU. Not all of the will
work as businesses, and this isn't a magical formula for success.
Moreover, the "specialized" GPUs seem to have applicability in CFD and
other areas. This is interesting as it opens a possibility for
significant acceleration of some computations. They fundamental
question is whether or not there will be wide adoption. I am not seeing
wide adoption of the GPU as a CFD engine right now, but what if you had
a "CFD engine" chip that cost about the same as the GPU, stuck it on a
card, and had a high level language interface to it, so you hand it your
expensive routines to crank on.
The physics chip bit got me thinking along the molecular dynamics lines
last night, specifically the non-bonded calculations. I am sure others
could regale us with their computational burdens (and I would like to
hear them myself at some point in time; it is quite instructive to hear
what people are worrying about).
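To make "non-bonded calculations" concrete, here is a toy, unoptimized
Lennard-Jones loop in reduced units (no cutoff, no neighbour lists, not
lifted from any particular MD package) -- exactly the kind of O(N^2) kernel
one would want to hand off to such a chip:

/* Toy Lennard-Jones non-bonded kernel: energy plus forces over all pairs. */
#include <stdio.h>

#define N 256

static double x[N][3], f[N][3];

static double lj_forces(void)
{
    double energy = 0.0;
    for (int i = 0; i < N; i++)
        f[i][0] = f[i][1] = f[i][2] = 0.0;
    for (int i = 0; i < N - 1; i++) {
        for (int j = i + 1; j < N; j++) {
            double d[3], r2 = 0.0;
            for (int k = 0; k < 3; k++) {
                d[k] = x[i][k] - x[j][k];
                r2 += d[k] * d[k];
            }
            double ir2 = 1.0 / r2, ir6 = ir2 * ir2 * ir2;
            energy += 4.0 * (ir6 * ir6 - ir6);
            double fs = 24.0 * (2.0 * ir6 * ir6 - ir6) * ir2;
            for (int k = 0; k < 3; k++) {
                f[i][k] += fs * d[k];
                f[j][k] -= fs * d[k];
            }
        }
    }
    return energy;
}

int main(void)
{
    for (int i = 0; i < N; i++)              /* 16x16 planar lattice */
        for (int k = 0; k < 3; k++)
            x[i][k] = 1.5 * ((i >> (4 * k)) & 15);
    printf("LJ energy = %g\n", lj_forces());
    return 0;
}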
I think the physics chip in hardware is a neat idea, though I think you
need a high level interface to it, open standards, and lots of support
to make it work. Moreover, it needs to be programmable: not because
physics changes so often, but because the implied models may differ from
what you want.
As I said, I am curious, and I think it is an interesting idea. If done
right, with the wind at the right angle and good user/community support, I
think it could work :)
>
> rgb
>
--
joe
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From laytonjb at charter.net Thu Mar 10 13:22:52 2005
From: laytonjb at charter.net (Jeffrey B. Layton)
Date: Thu, 10 Mar 2005 13:22:52 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <42308A89.5060907@scalableinformatics.com>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu> <42306665.8020305@scalableinformatics.com>
<42308A89.5060907@scalableinformatics.com>
Message-ID: <4230907C.6090007@charter.net>
Joe Landman wrote:
> Hmmm. OpenGL uses C/C++/Fortran bindings to get at the power (at
> least I think there is a way to call GL from fortran). What I was
> thinking was a high level (C/Fortran/C++) interface to them ala
> OpenGL. Jeff Layton if you are around, what is the name of that
> compiler set for the GPUs? Brook? Something like that.
The code is BrookGPU:
http://graphics.stanford.edu/projects/brookgpu/
This is derived from the Merrimac project at Stanford:
http://merrimac.stanford.edu/
There are also some other tools. For example, Sh:
http://libsh.org
I mentioned these in the ClusterWatch column in the March
2005 issue.
Jeff
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From mathog at mendel.bio.caltech.edu Thu Mar 10 14:13:18 2005
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Thu, 10 Mar 2005 11:13:18 -0800
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
Message-ID:
"Omri Schwarz" wrote:
> Would specialty (albeit commodity) coprocessors hanging off a
> PCI slot be suitable for your applications?
Well, a couple of points:
1. There may very well be no PCI slot available for such a
coprocessor. Forget about it for a 1U, and for a 2U the
available slots may already be used up by a small graphics card,
Myrinet, hardware monitor, etc. An open slot may not be sufficient;
there are almost certainly going to be space and cooling problems.
(I can't count the number of times an empty slot had to be
left between cards on PCs I've worked on.)
2. Assuming that you can get the card in there and not fry anything,
you're going to have to at the very least rewrite the code to make
use of the specialized hardware. Unless this board comes with a
very, very clever compiler to automagically detect the bits it can
do best you're looking at some serious time and/or money spent on
programming.
3. Where's the market to pay for this? Perhaps some sort of
specialized rendering engine might be able to sell enough units to
the folks who make CGI movies. Similarly, a specialized FFT
engine might find a home in many places. The physics engine that
started this thread might be of some use to, well, physicists.
Or maybe not; I'm going to guess that the "physics" it
implements may not work so well when scaled up to multi-lightyear
distances or down to the point where quantum mechanics is important.
Regards,
David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From James.P.Lux at jpl.nasa.gov Thu Mar 10 14:48:19 2005
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Thu, 10 Mar 2005 11:48:19 -0800
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
cluster.
In-Reply-To:
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
Message-ID: <6.1.1.1.2.20050310112134.02826ff8@mail.jpl.nasa.gov>
At 09:19 AM 3/10/2005, Robert G. Brown wrote:
>On Thu, 10 Mar 2005, Joe Landman wrote:
>
> >
> > Part of what motivates this question are things like the Cray XD1 FPGA
> > board, or PathScale's processors (unless I misunderstood their
> > functions). Other folks have CPUs on a card of various sorts, ranging
> > from FPGA to DSPs. I am basically wondering aloud what sort of demand
> > for such technology might exist. I assume the answer starts with "if
> > the price is right" ... the question is what is that price, what are
> > the features/functionality, and how hard do people want to work on such
> > bits.
>
>Problems with coprocessing solutions include:
>
> a) Cost -- sometimes they are expensive, although they >>can<< yield
>commensurate benefits for some code as you point out.
>
> b) Availability -- I don't just mean whether or not vendors can get
>them; I mean COTS vs non-COTS. They are frequently one-of-a-kind beasts
>with a single manufacturer.
Definitely an issue.
> c) Usability. They typically require "special tools" to use them at
>all. Cross-compilers, special libraries, code instrumentation. All of
>these things require fairly major programming effort to implement in
>your code to realize the speedup, and tend to decrease the
>general-purpose portability of the result, tying you even more tightly
>(after investing all this effort) with the (probably one) manufacturer
>of the add-on.
To a certain extent, though, this is being mitigated by things like Signal
Processing Workbench or Matlab, which have "plug-ins" to convert generic
algorithm descriptions (i.e. Simulink models, etc.) into runnable code on
the coprocessor or FPGA.
As far as product lock-in goes, "in theory" one could just recompile for a
new target processor, although I don't know if anyone's ever done this.
It does greatly reduce the "time and cost to demonstrate capability".
> d) Continued Availability -- They also not infrequently disappear
>without a trace (as "general purpose" coprocessors, not necessarily as
>ASICs) within a year or so of being released and marketed. This is
>because Moore's Law is brutal, and even if a co-processor DOES manage to
>speed up your actual application (and not just a core loop that
>comprises 70% of your actual application) by a factor of ten, that's at
>most four or five years of ML advances. If your code has a base of 30%
>or so that isn't sped up at all (fairly likely) then your application
>runs maybe 2-3 times as fast at best and ML eats it in 1-3 years.
There are specialized applications, lending themselves to clusters, for
which this might not hold. If we look at Xilinx FPGAs, for instance, while
not quite doubling every 18 months, they ARE dramatically increasing in
speed and size fairly quickly. And, it's not hugely difficult to take a
design that ran at speed X on a size-Y Xilinx FPGA and port it to speed A
on a size-B Xilinx FPGA.
Consider a classic big crunching ASIC/FPGA application, that of running
many correlators in parallel to demodulate very faint signals buried in
noise (specifically, raw data coming back from deep space probes), or some
applications in radio astronomy. In the latter case, particularly, there's
a lot of interest in taking an array of radio telescopes and simultaneously
forming many beams, so you can look in lots of directions at once, to look for
transient events that are "interesting" (like supernovae). The radio
astronomy community is relatively poor (Paul Allen's interest
notwithstanding), so they've got an incentive to use cheap commodity
processing for their needs, but off the shelf PCs might not hack
it. They're looking at a lot of architectures that strongly resemble the
usual cluster... data from all antennas streams into a raft of processors
via ethernet, and each processor forms some subset of beams either in space
or frequency. They might have a coprocessor card in the machine that does
some of the early really intensive beamforming computation.
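To give a feel for the arithmetic involved, here is a toy single-beam
delay-and-sum sketch in C (whole-sample delays only, made-up numbers, nothing
like a real correlator or the frequency-domain processing these arrays
actually use):

/* Sum each antenna's stream after advancing it by that antenna's known
 * arrival delay, so the wavefront lines up.  The cost scales as antennas x
 * samples x beams, which is why the front end wants dedicated hardware. */
#include <stdio.h>

#define ANTENNAS 8
#define SAMPLES  1024

void form_beam(float in[ANTENNAS][SAMPLES], const int delay[ANTENNAS],
               float out[SAMPLES])
{
    for (int t = 0; t < SAMPLES; t++) {
        float sum = 0.0f;
        for (int a = 0; a < ANTENNAS; a++) {
            int s = t + delay[a];           /* undo this antenna's delay */
            if (s >= 0 && s < SAMPLES)
                sum += in[a][s];
        }
        out[t] = sum / ANTENNAS;
    }
}

int main(void)
{
    static float in[ANTENNAS][SAMPLES], out[SAMPLES];
    int delay[ANTENNAS];
    for (int a = 0; a < ANTENNAS; a++) {
        delay[a] = a;                       /* made-up geometric delays */
        for (int t = 0; t < SAMPLES; t++)   /* same pulse, delayed per antenna */
            in[a][t] = (t >= delay[a] && t - delay[a] < 16) ? 1.0f : 0.0f;
    }
    form_beam(in, delay, out);
    printf("out[8] = %f (the delayed pulses add coherently)\n", out[8]);
    return 0;
}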
Take a look at the Allen Telescope Array or at the Square Kilometer Array
or at LOFAR.
>Anecdotally I'm reminded of e.g. the 8087, Micro Way's old transputer
>sets (advertised in PC mag for decades), the i860 (IIRC), the CM-5, and
>many other systems built over the years that tried to provide e.g. a
>vector co-processor in parallel with a regular general purpose CPU,
>sometimes on the same motherboard and bus, sometimes on daughterboards
>or even on little mini-network connections hung off the bus somehow.
>
>None of these really caught on (except for the 8087, and it is an
>exercise for the studio audience as to why an add-on processor that
>really should have been a part of the original processor itself, made by
>the mfr of the actual crippled CPU from the beginning, succeeded),
That's pretty easy. In the good old days, you had an integer CPU and an
add-on FPU in almost all architectures. The FPU didn't have instruction
decoding, sequencing, or anything like that... more like an extra ALU tied
to the internal bus. Just like having memory management in a separate
chip. Intel and Motorola both used this approach. Intel did start to
integrate the MMU into the chip with "segment registers" on the 8086,
except that it provided zip, zero, none, nada memory protection. This was
part of a strategy to keep the codebase compatible with the 8080. After
all, who in their right mind would write a program bigger than 64K? The
user application code would never look at the segment registers, which
would be managed by a multitasking OS. Think of it as integrated "bank
switching", which was quite popular in the 8-bit processor world (and
itself an outgrowth of how PDP-11 memory management worked).
It wasn't until the 80286 that there started to be some more sophistication,
and really, it was the 386 that made decent memory management possible.
Moto started with a virtual memory scheme and paging, and so became the
darling of software folks who had come to expect such things from the
PDP-11, DEC-10, DG, and even mainframe world.
In any case, NONE of them could have fit the FPU on the die and had decent
yields. Besides, you're talking processors that cost $200-400 (in the
1980s), and processors with integrated FPUs would have cost upwards of
$1K-$1.5K (because of the lower yield). As fab technology advanced, you
could either build bigger, faster processors (in the separate CPU/FPU model)
or you could build integrated processors at the same slow speed.
Even today, I'd venture to guess that the vast majority of CPU cycles spent
on PCs go to integer-mode computations (bitblts and the like to make windows
work). It's not like you need FP to do Word or PowerPoint, or even
Excel. It's rendered 3D graphics that really drives FP performance in the
consumer market.
This drives an interesting battle between the graphics ASIC makers (so that
an add-on card can do the rendering) and the CPU makers (who want to put it
onboard, so that the total system cost is less), as well as the support
provided by MS Windows to use either one effectively. The game market
clearly doesn't want to have to try and support ALL the possible graphics
cards out there (it was a nightmare trying to write high performance
graphics applications back in the late '80s and early '90s. The few skilled
folks who were good at it earned their shekels.)
>although nearly all of them were used by at least a few intrepid
>individuals to great benefit. Allowing that Nature is efficient in its
>process of natural selection, this seems like a genetic/memetic
>variation that generally lacks the CBA advantages required to make it a
>real success.
>
> rgb
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ocschwar at MIT.EDU Thu Mar 10 11:13:50 2005
From: ocschwar at MIT.EDU (Omri Schwarz)
Date: Thu, 10 Mar 2005 11:13:50 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <42306665.8020305@scalableinformatics.com>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005, Joe Landman wrote:
> > http://ageia.com
> >
> >
> > While I'm bringing this up, how about things like the MAP
> > processor?
> >
> > http://www.srccomp.com/HardwareElements.htm#MAPProcessor
>
> Or any others.
>
> Inverting the question, if you pay 4000$US per dual CPU compute node
> (+/- a bit depending upon technology, config, supplier), what price (if
> any) would you be willing to pay for an accelerator that offered you an
> order of magnitude more performance per node, on your code, and sat in
> the PCI-e/X or HTX slots? And also as important: how hard would you be
> willing to work/how much effort committed to program these things? This
> makes lots of assumptions, such as such a beast existing, your code
> being mapped or mappable to it, and you being interested in this.
One might presume that if a piece of kit becomes known as attractive
to our community there would be a port of BLAS, LAPACK, FFTW and so
on written for it in very short order.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From redboots at ufl.edu Thu Mar 10 15:00:23 2005
From: redboots at ufl.edu (Paul Johnson)
Date: Thu, 10 Mar 2005 15:00:23 -0500
Subject: [Beowulf] hpl - large problems fail
In-Reply-To: <1110402646.5643.127.camel@Vigor11>
References:
<1110402646.5643.127.camel@Vigor11>
Message-ID: <4230A757.1020203@ufl.edu>
All:
I have a 4-node cluster (don't snicker :) ) and I'm trying to do some
benchmarking with HPL. I want to test 2 of the nodes with 1 GB of
RAM each. I calculated the maximum problem size that can fit in 2 GB
and still allow for memory for the operating system. That came out to
be around 14500x14500. When I run that size of a test it always fails.
The largest problem that I can test and not have it fail on me is
12500x12500.
What is the reason behind this? I'm confused about what is going on here.
Thanks for any help.
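For what it's worth, the usual back-of-the-envelope sizing (assuming roughly
80% of the 2 GB can go to the double-precision matrix and the rest to the OS;
the 80% figure is just a guess) comes out close to that 14500:

/* HPL sizing: the N x N matrix of doubles must fit in the memory you are
 * willing to give it.  Compile with -lm. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double total_bytes = 2.0 * 1024 * 1024 * 1024;  /* 2 GB over 2 nodes */
    double fraction    = 0.80;                      /* leave room for the OS */
    double n = sqrt(fraction * total_bytes / sizeof(double));
    printf("N ~ %.0f (matrix alone: %.2f GB)\n",
           n, n * n * sizeof(double) / (1024.0 * 1024 * 1024));
    return 0;
}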
Regards,
Paul
--
Paul Johnson
Graduate Student - Mechanical Engineering
University of Florida - Gainesville, Fl
http://plaza.ufl.edu/redboots
Reclaim Your Inbox!
http://www.mozilla.org/products/thunderbird
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ocschwar at MIT.EDU Thu Mar 10 13:12:50 2005
From: ocschwar at MIT.EDU (Omri Schwarz)
Date: Thu, 10 Mar 2005 13:12:50 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To:
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005, Robert G. Brown wrote:
> On Thu, 10 Mar 2005, Joe Landman wrote:
>
> Problems with coprocessing solutions include:
>
> a) Cost -- sometimes they are expensive, although they >>can<< yield
> commensurate benefits for some code as you point out.
>
The COTS variants of coprocessor products are driven by market
demand for making Lara Croft look more like Angelina Jolie.
For us, that's good. It means the price might be right.
> b) Availability -- I don't just mean whether or not vendors can get
> them; I mean COTS vs non-COTS. They are frequently one-of-a-kind beasts
> with a single manufacturer.
Alas, the same market interaction above is driving prices down but
puts pressure on these products to have proprietary interfaces.
Electronic Arts, Inc (17 days without a fatal case of karoshi!)
would certainly like it that way.
> c) Usability. They typically require "special tools" to use them at
> all. Cross-compilers, special libraries, code instrumentation. All of
> these things require fairly major programming effort to implement in
> your code to realize the speedup, and tend to decrease the
> general-purpose portability of the result, tying you even more tightly
> (after investing all this effort) with the (probably one) manufacturer
> of the add-on.
But! If you're not an overly bespoke Beowulf user,
i.e. you use linear algebra packages, DSP, or so on and so forth,
and you devote the time to implement a library on such hardware,
you instantly have a community of people with similar needs,
an ability to help you, and your vendor now has a market that
might be worth paying attention to.
SGI seems to think so as far as GPUs go.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From agshew at gmail.com Thu Mar 10 14:04:19 2005
From: agshew at gmail.com (Andrew Shewmaker)
Date: Thu, 10 Mar 2005 12:04:19 -0700
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <42308A89.5060907@scalableinformatics.com>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
<42308A89.5060907@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005 12:57:29 -0500, Joe Landman
wrote:
> I think the physics chip in hardware is a neat idea, though I think you
> need a high level interface to it, open standards, and lots of support
> to make it work. Moreover, it needs to be programmable: not because
> physics changes so often, but because the implied models may differ from
> what you want.
I was interested in whether they were supporting Linux with their SDK [1].
Here's what I found:
Their SDK is unsurprisingly MS centric, but it is built on something called the
Open Dynamics Framework [2]. They don't have a Linux/OpenGL port yet,
but it looks like they have been designing it so they easily can, and they say
they probably will in the future. They are using C++ and provide Lua bindings
for rapid prototyping. Their PSCL (Physics Scripting Language) documentation
references the ODE (Open Dynamics Engine) [3], but I'm not quite sure how
they fit together other than they collaborated on PSCL. It looks like ODE does
currently run on Linux.
[1] http://www.ageia.com/novodex_downloads.html
[2] http://physicstools.org/forum1/
[3] http://ode.org/
--
Andrew Shewmaker
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From krystyx at acd.net Thu Mar 10 12:14:33 2005
From: krystyx at acd.net (Krys Kaya-Sar)
Date: Thu, 10 Mar 2005 12:14:33 -0500
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To:
Message-ID:
YMMV? My apologies, very new here.
-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On
Behalf Of Don Holmgren
Sent: Friday, March 04, 2005 13:44
To: Roger L. Smith
Cc: Joe Landman; Beowulf at beowulf.org; Mark Hahn
Subject: Re: [Beowulf] 2.6.11 is out; with InfiBand support
I've made two purchases in the last 12 months of 24-port switches.
Two switches last April came in at ~ $4000 each.
16 switches last Sept came in at ~ $3300 each.
These were two different brands of switches, both based on the Mellanox
Infiniscale III (24 port crossbar) silicon.
Clearly YMMV on pricing.
Don Holmgren
On Fri, 4 Mar 2005, Roger L. Smith wrote:
>
>
> The price I stated was for a 24 port switch for around $8,000 list. As a
> matter of fact, I just confirmed this with the vendor.
>
> This does not include cables or HCAs.
>
> On Fri, 4 Mar 2005, Jeffrey B. Layton wrote:
>
> > 8 ports under 8k, but it was a 24 port switch :)
> > This includes all of the HCA's, switches (only one),
> > cables, and software.
> >
> > Jeff
> >
> > > 8 ports under 8k or 24 ports under 8k?
> > >
> > > Jeffrey B. Layton wrote:
> > >
> > >> However, to match what Roger said, one IB vendor gave me
> > >> a list price for 8-ports of IB for under $8,000.
> > >
> >
>
>
> _\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_\|/_
> | Roger L. Smith                 Phone: 662-325-3625                       |
> | Sr. Systems Administrator      FAX: 662-325-7692                         |
> | roger at ERC.MsState.Edu        http://WWW.ERC.MsState.Edu/~roger        |
> | Mississippi State University                                             |
> |____________________________________ERC__________________________________|
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 10 15:29:11 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 10 Mar 2005 15:29:11 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <42308A89.5060907@scalableinformatics.com>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
<42306665.8020305@scalableinformatics.com>
<42308A89.5060907@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005, Joe Landman wrote:
> So there is an expression that I like attributing to myself, but I may
> have "borrowed" it from elsewhere.
>
> Something designed to fail often will.
>
> The "general purpose" accelerator cards (transputer, NS32032, ...) all
> suffered from a lack of application focus among other things. There was
> the prevalent attitude of "if you build it, then they will buy". These
> units largely failed to take hold apart from tiny niches.
>
> OTOH, "specialized" accelerator cards (Graphics cards, RAID cards, Sound
> cards) have been a smashing success, as the CBA makes sense, they
> deliver a specific value, and they are easy to use. The take home
> message is that any accelerator card needs to do the same. What these
> accelerator cards do is offload work from the CPU. Not all of them will
> work as businesses, and this isn't a magical formula for success.
And you have the volume issue. Offhand, I can easily think of at least
a few HPC coprocessor cards that might be useful in a cluster:
A $30 PCI-bus card that does nothing but generate super-high-quality
random numbers (uniform deviates of various widths and/or ints of
various widths) at high speed (faster than the CPU can, which means say
50 megarands/second or better) and deliver them directly to memory
without the CPU's help (so one can build a circular queue and keep it
full with only occasional calls requesting the next block of rands)
would be a Great Boon to Monte Carlo-heads like myself.
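To make the usage model concrete, here is a toy sketch of that circular
queue with plain rand() standing in for the card (a real device would DMA
each refill block straight into the buffer, and the refill would happen
asynchronously rather than in-line):

/* Ring buffer of rands: the consumer just pulls numbers; only one call in
 * QSIZE asks for a new block. */
#include <stdio.h>
#include <stdlib.h>

#define QSIZE 4096

static unsigned int queue[QSIZE];
static int head = QSIZE;           /* force a fill on first use */

static void refill(void)           /* stand-in for "the card fills the queue" */
{
    for (int i = 0; i < QSIZE; i++)
        queue[i] = (unsigned int)rand();
    head = 0;
}

static unsigned int next_rand(void)
{
    if (head == QSIZE)             /* the occasional block request */
        refill();
    return queue[head++];
}

int main(void)
{
    double sum = 0.0;
    for (long i = 0; i < 1000000; i++)      /* toy Monte Carlo consumer */
        sum += next_rand() / (double)RAND_MAX;
    printf("mean of 1e6 uniform deviates: %f\n", sum / 1e6);
    return 0;
}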
A $30 linear algebra card. Yes, I know -- a lot of graphics cards are
essentially vector processors and can be used in this way, but I'm not
satisfied. These cards aren't DESIGNED to be used as general purpose
LAPACK-like or BLAS-like engines. I can't help but think that one could
design a set of such cards that would function like a little
mini-cluster even within a single system, partitioning the problem and
doing sub-blocks, all in parallel with the main CPU and working directly
with memory.
There are probably more, but random numbers and linear algebra are both
major components of a lot of work. Look at the problem here. Your $30
graphics chip is used in tens of millions of units per year. Your $30
random number generator card a) has to "work", which is not trivial to
arrange. I have a bitwise random number test in dieharder (a GPL
package for testing random numbers I'm working on) that every supposed
rng in the GSL fails at six bits, most still fail at five bits, and
quite a few fail at four. That is, forget uniform distribution of
BYTES, let alone 4-byte sequences -- there are measurable deviations
away from random for just 6-bit substrings of a long string of bits.
Hardware rng's are often no better. b) you have to be able to sell
enough to make money, and that will be tough at $30/card...
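A stripped-down version of that sort of bit-level check (nothing like
dieharder itself, and using the C library's rand() as a stand-in bit source)
is easy to write; a good source should show only binomial-sized scatter
around the expected count for each of the 64 patterns:

/* Slide a 6-bit window along a bit stream and histogram the 64 patterns. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const long nbits = 10 * 1000 * 1000;
    long count[64] = {0};
    unsigned window = 0;
    srand(12345);
    for (long i = 0; i < nbits; i++) {
        unsigned bit = (unsigned)rand() & 1u;   /* stand-in bit source */
        window = ((window << 1) | bit) & 63u;
        if (i >= 5)                             /* window full from bit 6 on */
            count[window]++;
    }
    long lo = count[0], hi = count[0];
    for (int i = 1; i < 64; i++) {
        if (count[i] < lo) lo = count[i];
        if (count[i] > hi) hi = count[i];
    }
    printf("expected per pattern ~ %ld, observed min %ld, max %ld\n",
           (nbits - 5) / 64, lo, hi);
    return 0;
}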
Ditto for linear algebra, although there, there is a high-end market and
companies DO sell engines for a lot more than $30 to a very small
market. And make money.
We just want the best of both worlds...
>
> Moreover, the "specialized" GPUs seem to have applicability in CFD and
> other areas. This is interesting as it opens a possibility for
> significant acceleration of some computations. The fundamental
> question is whether or not there will be wide adoption. I am not seeing
> wide adoption of the GPU as a CFD engine right now, but what if you had
> a "CFD engine" chip that cost about the same as the GPU, stuck it on a
> card, and had a high level language interface to it, so you hand it your
> expensive routines to crank on.
>
> The physics chip bit got me thinking along the molecular dynamics lines
> last night, specifically the non-bonded calculations. I am sure others
> could regale us with their computational burdens (and I would like to
> hear them myself at some point in time, it is quite instructive to hear
> what people are worrying about).
Ya, stuff like this would be great -- ODE solvers on a chip or add-on
card. But NOT easy to build and NOT that big a market.
rgb
>
>
> I think the physics chip in hardware is a neat idea, though I think you
> need a high level interface to it, open standards, and lots of support
> to make it work. Moreover, it needs to be programmable: not because
> physics changes so often, but because the implied models may differ from
> what you want.
>
> As I said, I am curious, and I think it is an interesting idea. If done
> right, with the wind at the right angles, good user/community support, I
> think it could work :)
>
>
>
>
> >
> > rgb
> >
>
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From Bogdan.Costescu at iwr.uni-heidelberg.de Thu Mar 10 15:38:15 2005
From: Bogdan.Costescu at iwr.uni-heidelberg.de (Bogdan Costescu)
Date: Thu, 10 Mar 2005 21:38:15 +0100 (CET)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <42308A89.5060907@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005, Joe Landman wrote:
> The physics chip bit got me thinking along the molecular dynamics lines
> last night, specifically the non-bonded calculations.
http://www.research.ibm.com/grape/
is already quite old... CHARMM and I think AMBER (as MD applications)
were able to use the chips - given that these are 2 of the most used
MD codes, it would suggest that there was a significant gain in speed
to justify the extra coding.
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gmpc at sanger.ac.uk Thu Mar 10 17:11:37 2005
From: gmpc at sanger.ac.uk (Guy Coates)
Date: Thu, 10 Mar 2005 22:11:37 +0000 (GMT)
Subject: [Beowulf] hpl - large problems fail
In-Reply-To: <4230A757.1020203@ufl.edu>
References:
<1110402646.5643.127.camel@Vigor11> <4230A757.1020203@ufl.edu>
Message-ID:
On Thu, 10 Mar 2005, Paul Johnson wrote:
> All:
>
> I have a 4 node cluster(dont snicker :) )
Everyone starts off small.
> and Im trying to do some
> benchmarking with HPL. I want to test 2 of the nodes with 1Gb of
> ram each. I calculated the maximum problem size that can fit in 2Gb
> and still allow for memory for the operating system. That came out to
> be around 14500x14500. When I run that size of a test it always fails.
> The largest problem that I can test and not have it fail on me is
> 12500x12500.
> What is the reason behind this? Im confused on what is going on here.
> Thanks for any help.
Do you know what actually caused the failure?
If your problem size was too big, and you are really out of memory, you
should see some messages in the system log saying the out-of-memory-killer
was activated and HPL was zapped.
If you know your machines were not actually out of memory, then you have
broken hardware on one of your nodes. Run memtest86+ or memtest86 on your
nodes (Possibly the world's most useful pieces of diagnostic software).
http://www.memtest86.com
http://www.memtest.org
If you haven't seen it, IBM have a redpaper on tuning HPL, which gives
some good starting parameters, problem-sizing tips and an overview of
different BLAS libraries you can compile against to get that extra few
Gflops of performance.
Cheers,
Guy
--
Dr. Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ctierney at hpti.com Thu Mar 10 17:29:07 2005
From: ctierney at hpti.com (Craig Tierney)
Date: Thu, 10 Mar 2005 15:29:07 -0700
Subject: [Beowulf] hpl - large problems fail
In-Reply-To: <4230A757.1020203@ufl.edu>
References:
<1110402646.5643.127.camel@Vigor11> <4230A757.1020203@ufl.edu>
Message-ID: <1110493747.2973.75.camel@hpti10.fsl.noaa.gov>
On Thu, 2005-03-10 at 13:00, Paul Johnson wrote:
> All:
>
> I have a 4 node cluster(dont snicker :) ) and Im trying to do some
> benchmarking with HPL. I want to test 2 of the nodes with 1Gb of
> ram each. I calculated the maximum problem size that can fit in 2Gb
> and still allow for memory for the operating system. That came out to
> be around 14500x14500. When I run that size of a test it always fails.
> The largest problem that I can test and not have it fail on me is
> 12500x12500.
> What is the reason behind this? Im confused on what is going on here.
> Thanks for any help.
Run HPL on each node by itself, using all of the memory.
You probably have a bad stick of memory somewhere.
Craig
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From redboots at ufl.edu Thu Mar 10 17:56:16 2005
From: redboots at ufl.edu (Paul Johnson)
Date: Thu, 10 Mar 2005 17:56:16 -0500
Subject: Clarification: [Beowulf] hpl - large problems fail
In-Reply-To:
References:
<1110402646.5643.127.camel@Vigor11> <4230A757.1020203@ufl.edu>
Message-ID: <4230D090.4060301@ufl.edu>
Guy Coates wrote:
>On Thu, 10 Mar 2005, Paul Johnson wrote:
>
>
>
>>All:
>>
>>I have a 4 node cluster(dont snicker :) )
>>
>>
>
>Everyone starts off small.
>
>and Im trying to do some
>
>
>>benchmarking with HPL. I want to test 2 of the nodes with 1Gb of
>>ram each. I calculated the maximum problem size that can fit in 2Gb
>>and still allow for memory for the operating system. That came out to
>>be around 14500x14500. When I run that size of a test it always fails.
>>The largest problem that I can test and not have it fail on me is
>>12500x12500.
>>What is the reason behind this? Im confused on what is going on here.
>>Thanks for any help.
>>
>>
>
>
>Do you know what actually caused the failure?
>
>If your problem size was too big, and you are really out of memory, you
>should see some messages in the system log saying the out-of-memory-killer
>was activated and HPL was zapped.
>
>If you know your machines was not actually out of memory, then you have
>broken hardware on one of your nodes. Run memtest+ or memtest on your
>nodes (Possibly the world's most useful pieces of diagnostic software).
>
>http://www.memtest86.com
>http://www.memtest.org
>
>
>If you haven't seen it, IBM have a redpaper on tuning HPL, which gives
>some good starting parameters, problem-sizing tips and an overview of
>different BLAS libraries you can compile against to get that extra few
>Gflops of performance.
>
>Cheers,
>
>Guy
>
>
>
I should have been clearer in my description. It doesn't fail at
the command prompt when I run it. It fails when it checks the solution
to the linear equations: the residual is too high. This is part
of the data from my HPL.out file:
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
WC12R2L4 14500 64 1 2 388.43 5.233e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 284363.4669186 ...... FAILED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 210262.3627204 ...... FAILED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 41377.6398965 ...... FAILED
||Ax-b||_oo . . . . . . . . . . . . . . . . . = 0.001692
||A||_oo . . . . . . . . . . . . . . . . . . . = 3708.772315
||A||_1 . . . . . . . . . . . . . . . . . . . = 3695.221759
||x||_oo . . . . . . . . . . . . . . . . . . . = 6.847285
||x||_1 . . . . . . . . . . . . . . . . . . . = 19610.120504
============================================================================
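For anyone decoding those FAILED lines: HPL scales the residual by the
machine precision, the matrix norm, and N, and compares the result against
the threshold in HPL.dat (16.0 in the stock input file). Plugging the norms
printed above into the first formula shows just how far off this run is -- a
healthy run gives a ratio of order 1:

/* Scaled residual check, using the numbers from the HPL.out excerpt above. */
#include <stdio.h>
#include <float.h>

int main(void)
{
    double r_inf = 0.001692;         /* ||Ax-b||_oo */
    double a_one = 3695.221759;      /* ||A||_1 */
    double n     = 14500.0;
    double eps   = DBL_EPSILON / 2;  /* relative machine precision, ~1.1e-16 */
    double ratio = r_inf / (eps * a_one * n);
    printf("scaled residual = %.0f (threshold ~16) -> %s\n",
           ratio, ratio < 16.0 ? "PASSED" : "FAILED");
    return 0;
}

A residual that large means the answer really is wrong, not that HPL ran out
of memory -- which is why the memtest suggestion above is the right next step.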
Sorry for the confusion,
Paul
--
Paul Johnson
Graduate Student - Mechanical Engineering
University of Florida - Gainesville, Fl
http://plaza.ufl.edu/redboots
Reclaim Your Inbox!
http://www.mozilla.org/products/thunderbird
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From mathog at mendel.bio.caltech.edu Thu Mar 10 19:22:39 2005
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Thu, 10 Mar 2005 16:22:39 -0800
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
Message-ID:
"Robert G. Brown" wrote
> A $30 PCI-bus card that does nothing but generate super-high-quality
> random numbers (uniform deviates of various widths and/or ints of
> various widths) at high speed (faster than the CPU can, which means say
> 50 megarands/second or better) and deliver them directly to memory
> without the CPUs help (so one can build a circular queue and keep it
> full with only occasional calls requesting the next block of rands)
> would be a Great Boon to Monte Carlo-heads like myself.
I've often thought the same. The thing is, it doesn't have to be
pseudorandom numbers; it can be real random numbers generated
by a physical process. For instance, put some radioactive
material on the card and count the decays observed in a
fixed interval. In that case the faster you want the random
numbers the hotter the source must be. Unfortunately a
Curie is "only" 37,000,000,000 decays per second. If you want
50 Mrands per second and use 1 Curie of material that's only
about 740 expected decays per interval, assuming that you can catch
them all, so not as many bits in those rands as you might want.
Also I'm pretty sure I don't want to be anywhere near a beowulf
with 1 Curie of radiation in each node! When I was considering this
it was just to generate a small number of random numbers at a much
slower rate, in which case radiation equivalent to a smoke alarm
would have sufficed.
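Spelling that arithmetic out (the bits-per-count line is only a rough
large-mean Poisson estimate, not anything exact):

/* 1 curie at 50 Mrand/s: how many decays per sample interval, and roughly
 * how much entropy each count carries.  Compile with -lm. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double pi = 3.141592653589793, e = 2.718281828459045;
    double decays_per_sec = 3.7e10;      /* 1 curie */
    double rands_per_sec  = 5.0e7;       /* 50 Mrand/s target */
    double lambda = decays_per_sec / rands_per_sec;
    double bits = 0.5 * log2(2.0 * pi * e * lambda);  /* rough Poisson entropy */
    printf("expected decays per interval: %.0f\n", lambda);
    printf("rough entropy per count: %.1f bits\n", bits);
    return 0;
}

So each 20 ns window yields a count worth only about 7 random bits, which is
the "not as many bits as you might want" problem in numbers.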
It might be safer and a good deal more practical to use shot noise
instead. That could be integrated nicely onto a chip, with
thousands of little amplifiers and adders all plugging away
in parallel to generate your numbers.
And does it really have to be in a PCI slot? Why not build it
into memory? Then the numbers don't need to move across the bus
at all; any read from that stick will contain a random number.
Just be sure that said card has some way of passing the
POST sequence. Tricky doing both I suppose.
Regards,
David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 10 20:08:17 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 10 Mar 2005 20:08:17 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To:
References:
Message-ID:
On Thu, 10 Mar 2005, David Mathog wrote:
> "Robert G. Brown" wrote
>
> > A $30 PCI-bus card that does nothing but generate super-high-quality
> > random numbers (uniform deviates of various widths and/or ints of
> > various widths) at high speed (faster than the CPU can, which means say
> > 50 megarands/second or better) and deliver them directly to memory
> > without the CPUs help (so one can build a circular queue and keep it
> > full with only occasional calls requesting the next block of rands)
> > would be a Great Boon to Monte Carlo-heads like myself.
>
> I've often thought the same. The thing is, it doesn't have to be
> pseudorandom numbers, it can be real random numbers generated
> by a physical process. For instance, put some radioactive
> material on the card and count the decays observed in a
> fixed interval. In that case the faster you want the random
> numbers the hotter the source must be. Unfortunately a
> Curie is "only" 37,000,000,000 decays per second. If you want
> 50 Mrands per second and use 1 Curie of material that's only
> about 740 expected decays per interval, assuming that you can catch
> them all, so not as many bits in those rands as you might want.
> Also I'm pretty sure I don't want to be anywhere near a beowulf
> with 1 Curie of radiation in each node! When I was considering this
> it was just to generate a small number of random numbers at a much
> slower rate, in which case radiation equivalent to a smoke alarm
> would have sufficed.
Yeah, I've done all of these computations and looked into various
quantum devices and entropy based generators. However, you'd be
surprised how difficult it is to make a HARDWARE based random number
device that passes e.g. diehard. There is a deep truth here. "Random
Number Generator" really is an oxymoron in a universe governed by
natural law. Even quantum randomness arises from the fact that the
"system" generating it is an open one in contact with a universe/bath in
an unknown state (and watch out or I'll hit you up with my "Generalized
Master Equation" lecture...;-).
Unpredictability, you can find under every leaf. Randomness? Not so
easy.
> It might be safer and a good deal more practical to use shot noise
> instead. That could be integrated nicely onto a chip, with
> thousands of little amplifiers and adders all plugging away
> in parallel to generate your numbers.
Lots of them out there, but not so easy to make random enough at the
bit level to make diehard happy. Shot noise, thermal noise, photon
counters, more. They tend not to be cheap, are not necessarily
particularly random, and are almost universally "slow as molasses"
compared to a good pseudorandom number generator.
>
> And does it really have to be in a PCI slot? Why not build it
> into memory, Then the numbers don't need to move across the bus
> at all, any read from that stick will contain a random number.
> Just be sure that said card has some way of passing the
> POST sequence. Tricky doing both I suppose.
Actually IIRC there was some effort by Intel to build something into a
CPU or some other part of a mobo chipset, but I don't know what came of
it. The people who need this commercially are webvolken and encryption
algorithms, where "unpredictable" is almost as good as "random to the
nth bit of randomness". Nobody uses marginally secure encryption if
they can help it. However, these folks are totally happy with
kilorands/sec, let alone megarands/sec. I need hundreds of
megarands/sec.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From landman at scalableinformatics.com Thu Mar 10 20:48:42 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 10 Mar 2005 20:48:42 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To:
References:
Message-ID: <4230F8FA.9070405@scalableinformatics.com>
Robert G. Brown wrote:
> Actually IIRC there was some effort by Intel to build something into a
> CPU or some other part of a mobo chipset, but I don't know what came of
> it. The people who need this commercially are webvolken and encryption
> algorithms, where "unpredictable" is almost as good as "random to the
> nth bit of randomness". Nobody uses marginally secure encryption if
> they can help it. However, these folks are totally happy with
> kilorands/sec, let alone megarands/sec. I need hundreds of
> megarands/sec.
(thinking aloud)
Hmmm. I bet we could generate this (if you don't mind tt800 or its MT
ilk) in software using some really neat and very powerful (and
inexpensive) COTS chips. Don't know if we could hit 0.1 GRPS (gigarands
per second), but I bet it could come pretty close.
I have been thinking about a USB dongle for the past few months that
provided random ints at a pretty nice rate. The chip I have in mind for
this might be able to be powered off the USB port also (I think, I need
to see the overall power needs).
Could do a PCI card... would cost more to fabricate and sell. The $30
mark might be low. I'd guess closer to about $100 for that, unless we
could get significant volumes ... (economies of scale work wonders for
component pricing).
(/thinking aloud)
>
> rgb
>
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From natorro at fisica.unam.mx Thu Mar 10 20:32:07 2005
From: natorro at fisica.unam.mx (Carlos Lopez Nataren)
Date: Thu, 10 Mar 2005 19:32:07 -0600
Subject: [Beowulf] How to disable GUI in Mac OS X?
Message-ID: <1110504727.3267.1.camel@linux>
I'm doing some performance tests with a G5 Xserve, both with Linux and
Mac OS X, but it seems unfair to test the performance of Mac OS X with the
graphical interface up. Does anyone know how to disable the graphical
interface so I just get a nice text console?
Thanks a lot in advance
Carlos
--
Carlos Lopez Nataren
Instituto de Fisica, UNAM
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From atp at piskorski.com Thu Mar 10 21:52:15 2005
From: atp at piskorski.com (Andrew Piskorski)
Date: Thu, 10 Mar 2005 21:52:15 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
References: <200503100507.j2A571J1000565@zygorthian-space-raiders.mit.edu>
Message-ID: <20050311025215.GA88338@piskorski.com>
On Thu, Mar 10, 2005 at 12:07:01AM -0500, Omri Schwarz wrote:
> http://ageia.com
Interesting. Commodity - or at least "cheap commodity" - means its
primary market has to be something other than HPC, of course. 3D
video games qualify. So, the main questions I see are:
1. What kind of HPC stuff is this Ageia PhysX chip good for, and how
good?
2. How likely is it to get really, really widely used in non-HPC
fields, thus driving down price?
3. What other such candidate chips are out there? Same questions for
those.
4. What to do about it? (E.g., "Heh, Ageia, there's this little HPC
niche you could sell to with maybe no extra effort.")
Question 1 seems pretty open. Looks like they've released nearly
nothing about its actual hardware design and performance. Handles
40,000 rigid bodies, 0.13 um process, 125 million transistors, 25
Watts. That's about it so far.
On question 2, who knows. But, supposedly Sega has licensed the PhysX
chip. (Naturally, no word whether or not they will be shipping it in
millions of game consoles.) Some other googling suggests that at
least some big game developers are supporting for it (e.g., "the
Unreal Engine 3 thoroughly exploits the Novodex physics API").
So it doesn't seem totally impossible that either Ageia or some
competitor (say, the GPU companies) might succeed in creating a new
market niche. But it's all very early; AFAIK no one has even
announced an actual board with the chip yet, much less offered one for
sale. If I had to guess, they probably haven't fab'd (at TSMC) more
than a couple sample lots yet. The press is that they're shooting for
retail hardware on sale for Christmas.
What I find interesting is that if these sorts of highly realistic,
"render explosions in real-time" simulations catch on, then even if this
PhysX chip turns out to be useless for anything other than this year's
latest first person shooter, its successors may not be.
potential opportunity for cluster users to ride on the mass market
coattails...
--
Andrew Piskorski
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Thu Mar 10 22:31:07 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu, 10 Mar 2005 22:31:07 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To: <4230F8FA.9070405@scalableinformatics.com>
References:
<4230F8FA.9070405@scalableinformatics.com>
Message-ID:
On Thu, 10 Mar 2005, Joe Landman wrote:
>
>
> Robert G. Brown wrote:
>
> > Actually IIRC there was some effort by Intel to build something into a
> > CPU or some other part of a mobo chipset, but I don't know what came of
> > it. The people who need this commercially are webvolken and encryption
> > algorithms, where "unpredictable" is almost as good as "random to the
> > nth bit of randomness". Nobody uses marginally secure encryption if
> > they can help it. However, these folks are totally happy with
> > kilorands/sec, let alone megarands/sec. I need hundreds of
> > megarands/sec.
>
> (thinking aloud)
>
> Hmmm. I bet we could generate this (if you dont mind tt800 or its MT
> ilk), in software using some really neat and very powerful (and
> inexpensive) COTS chips. Dont know if we could hit 0.1 GRPS (gigarand
> per second), but I bet it could come pretty close.
As I said, it isn't as easy as it sounds to generate truly random
numbers. Pseudorandom yes, but there I can generate immediate data
(from my dieharder program, for example) on how fast you can make them.
On a dual 242 Opteron, for example, using the mt19937_1999 generator
(one of the best, fastest rngs from the GSL) I get roughly 50 MegaRPS
per processor, or 100 MRPS total. Generators of comparable or a bit
better quality tend to run just a bit slower to a lot slower. The
built-in /dev/random entropy generator in Linux is VERY slow and blocks when
there is insufficient entropy. /dev/urandom doesn't block, is quite
unpredictable, but is STILL very, very slow at less than 1 MRPS on
nearly any modern hardware. This gives you some idea of how difficult
it is to get 100 MRPS rates on ANY sort of hardware, even with
pseudorandom number generators.
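For anyone who wants to reproduce that sort of number on their own box, a
quick-and-dirty timing loop against the GSL generator is enough (link with
-lgsl -lgslcblas; the rate you see will of course depend on the CPU):

/* Time raw gsl_rng_get() calls from mt19937_1999. */
#include <stdio.h>
#include <time.h>
#include <gsl/gsl_rng.h>

int main(void)
{
    const long n = 100 * 1000 * 1000;
    gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937_1999);
    unsigned long sink = 0;
    clock_t t0 = clock();
    for (long i = 0; i < n; i++)
        sink ^= gsl_rng_get(r);             /* keep the result "used" */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    printf("%ld rands in %.2f s = %.1f Mrands/sec (sink %lx)\n",
           n, secs, n / secs / 1e6, sink);
    gsl_rng_free(r);
    return 0;
}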
Hardware generators have the same issues. Radioactivity-based
generators need to be pretty damn hot to generate 3.2 random gigabits
per second. Photon counters have all sorts of hardware response time
issues (and tend not to be as Poissonian/random as you might think; see
Hanbury-Brown-Twiss or computations of antibunching in the fluorescence
spectrum). "Entropy"-based generators can be OK, but again, it is
difficult to know all the timescales of decay/decorrelation of all
aspects of order, as nearly ALL hardware-based sources are not
independently random for very short times -- rather you hope that they
have "only" relatively short autocorrelation times and don't sample them
too often.
This is one of the things I might be working on a bit over the next
year, and is one reason I started working on dieharder (the tool I use
both to measure their quality of randomness, so to speak, and to time
them as well).
> I have been thinking about a USB dongle for the past few months that
> provided random ints at a pretty nice rate. The chip I have in mind for
> this might be able to be powered off the USB port also (I think, I need
> to see the overall power needs).
>
> Could do a PCI card... would cost more to fabricate and sell. The $30
> mark might be low. I'd guess closer to about $100 for that, unless we
> could get significant volumes ... (economies of scale work wonders for
> component pricing).
FIRST you have to come up with a design, THEN you have to build a
prototype and test it. Chances are that when you test it you'll find
that while perhaps (if you're lucky) it is unpredictable, it isn't
"uniformly random" at the bit level. After all, plenty of things sample
a random DISTRIBUTION (say a Gaussian) that is unpredictable but not
uniform. Then you get to look for transformations that will take your
non-uniform random distribution and make it uniform at the bit level.
Typically this will cost you bits -- for example, if your original
distribution has a bias in its 0 vs 1 numbers you can combine pairs of
bits to create a distribution that is uniform in 0 and 1. Then you have
to look at 00, 01, 10, and 11 -- do they all occur binomially
distributed (on average) with p = 0.25? How about 000, 001, 010, 011,
100, 101, 110, 111 with p = 0.125? Etc. (I find that NO pseudorandom
number generators I've tested have the right six or greater bit
distribution, that is, 000000, 000001, 000010, ... are not each
binomially distributed with p = 1/64 in a very large bit sample.)
Eventually you get it to pass diehard (and maybe even dieharder) at the
expense of many bits and a consequent halving (or worse) of the peak
rate. If it is STILL really fast -- 100 MRPS is a good thing to shoot
for, as it is the equivalent of a dedicated dual Opteron -- THEN you can
look for a bus and a market.
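(A concrete sketch of the bit-pair trick mentioned above -- the von Neumann
whitening step discussed elsewhere in this thread. It is illustrative code,
not dieharder: emit 0 for an input pair 01, 1 for 10, and discard 00 and 11,
so a pair of raw bits yields at most one output bit, and the output is
exactly unbiased only if the raw bits are independent, which is the hard
part in practice.)

#include <stddef.h>

/* Debias a raw bit stream: scan 'in' (nbits raw bits, LSB-first within
 * each byte), write whitened bits into 'out', and return how many output
 * bits were produced.  Only the first returned-count bits of 'out' are
 * meaningful.                                                            */
size_t vn_debias(const unsigned char *in, size_t nbits, unsigned char *out)
{
    size_t i, outbits = 0;

    for (i = 0; i + 1 < nbits; i += 2) {
        int b0 = (in[i / 8]       >> (i % 8))       & 1;
        int b1 = (in[(i + 1) / 8] >> ((i + 1) % 8)) & 1;

        if (b0 == b1)
            continue;                      /* 00 or 11: discard the pair */
        if (b0)                            /* 10 -> emit 1               */
            out[outbits / 8] |=  (unsigned char)(1u << (outbits % 8));
        else                               /* 01 -> emit 0               */
            out[outbits / 8] &= (unsigned char)~(1u << (outbits % 8));
        outbits++;
    }
    return outbits;
}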
Not easy at all. Or cheap. Intel and other companies have long been
looking for a good way of doing this, and if you find one (especially
one that could be incorporated on a single chip) you can probably sell
it directly to them and retire.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From atp at piskorski.com Thu Mar 10 23:50:50 2005
From: atp at piskorski.com (Andrew Piskorski)
Date: Thu, 10 Mar 2005 23:50:50 -0500
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To:
References:
Message-ID: <20050311045050.GA31865@piskorski.com>
On Thu, Mar 10, 2005 at 08:08:17PM -0500, Robert G. Brown wrote:
> they can help it. However, these folks are totally happy with
> kilorands/sec, let alone megarands/sec. I need hundreds of
> megarands/sec.
The VIA x86 CPUs have "PadLock ACE" RNG hardware, but they only claim
1.5 Mbit/s or so. Lots slower than your dual Opterons pushing PRNG
code, but then, those little VIAs are probably happy to get other work
done too while spitting out those random bits. :)
Some web links say that Diehard is happy with the Via's output. But
then I've also heard that Diehard wasn't intended (still true?) for
finding the type of non-randomness hardware generators are prone to:
http://www.robertnz.net/hwrng.htm
http://www.robertnz.net/true_rng.html
I wonder if the Via's underlying randomness sampling mechanism is
inherently rate limited, or if they could scale it up to a single
small RNG-only chip with 100 or 1000 times the bit rate. There does
seem to be some market for such a thing, so since they have not done
so, probably not?
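(One way to answer the rate question empirically, once a 2.6.11 kernel with
the driver is running, is simply to time reads from whatever character
device the kernel exposes for the hardware RNG -- the node name is an
assumption here and varies by distribution, /dev/hwrng and /dev/hw_random
are both seen -- and compare against /dev/urandom. A rough illustrative C
sketch, not a statistical test:)

/* Time raw reads from a random-device node and report the bit rate.
 * Usage: ./devrate /dev/urandom   (or the hardware RNG node)            */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
    unsigned char buf[4096];
    unsigned long total = 0, want = 1024 * 1024;   /* read 1 MB */
    struct timeval t0, t1;
    double secs;
    ssize_t n;
    int fd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <device>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    gettimeofday(&t0, NULL);
    while (total < want) {
        n = read(fd, buf, sizeof buf);
        if (n <= 0) {
            perror("read");
            break;
        }
        total += (unsigned long)n;
    }
    gettimeofday(&t1, NULL);
    close(fd);
    secs = (t1.tv_sec - t0.tv_sec) + 1e-6 * (t1.tv_usec - t0.tv_usec);
    printf("%lu bytes in %.2f s = %.3f Mbit/s\n",
           total, secs, total * 8.0 / secs / 1e6);
    return 0;
}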
--
Andrew Piskorski
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at streamline-computing.com Fri Mar 11 03:02:11 2005
From: john.hearns at streamline-computing.com (John Hearns)
Date: Fri, 11 Mar 2005 08:02:11 +0000
Subject: [Beowulf] 2.6.11 is out; with InfiBand support
In-Reply-To:
References:
Message-ID: <1110528131.5349.1.camel@Vigor45>
On Thu, 2005-03-10 at 12:14 -0500, Krys Kaya-Sar wrote:
> YMMV? my apologies, very new here.
>
Your Mileage May Vary
If I'm not wrong, this comes from US automobile advertisements.
The manufacturers quote a certain petrol (gas) mileage for the car,
but warn the customer that the mileage they get may vary from that.
Same with computers -- you might not achieve the same results.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Fri Mar 11 03:03:14 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Fri, 11 Mar 2005 09:03:14 +0100
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To:
References:
Message-ID: <20050311080314.GF17303@leitl.org>
On Thu, Mar 10, 2005 at 04:22:39PM -0800, David Mathog wrote:
> "Robert G. Brown" wrote
>
> > A $30 PCI-bus card that does nothing but generate super-high-quality
> > random numbers (uniform deviates of various widths and/or ints of
> > various widths) at high speed (faster than the CPU can, which means say
> > 50 megarands/second or better) and deliver them directly to memory
> > without the CPUs help (so one can build a circular queue and keep it
> > full with only occasional calls requesting the next block of rands)
> > would be a Great Boon to Monte Carlo-heads like myself.
>
> I've often thought the same. The thing is, it doesn't have to be
> pseudorandom numbers, it can be real random numbers generated
Some can do both.
http://www.via.com.tw/en/initiatives/padlock/hardware.jsp
Unfortunately, the VIA CPUs are otherwise next to useless for numerics.
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Fri Mar 11 03:24:48 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Fri, 11 Mar 2005 09:24:48 +0100
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To:
References:
Message-ID: <20050311082447.GG17303@leitl.org>
On Thu, Mar 10, 2005 at 08:08:17PM -0500, Robert G. Brown wrote:
> Yeah, I've done all of these computations and looked into various
> quantum devices and entropy based generators. However, you'd be
> surprised how difficult it is to make a HARDWARE based random number
> device that passes e.g. diehard. There is a deep truth here. "Random
http://www.via.com.tw/en/downloads/whitepapers/initiatives/padlock/evaluation_padlock_rng.pdf
Support is there starting with kernel 2.6.11.
> Actually IIRC there was some effort by Intel to build something into a
> CPU or some other part of a mobo chipset, but I don't know what came of
> it. The people who need this commercially are webvolken and encryption
It's still there in some chipsets, but you can't rely on it being there if
you don't control your hardware purchases.
> algorithms, where "unpredictable" is almost as good as "random to the
> nth bit of randomness". Nobody uses marginally secure encryption if
> they can help it. However, these folks are totally happy with
> kilorands/sec, let alone megarands/sec. I need hundreds of
> megarands/sec.
http://fp.gladman.plus.com/ACE/
claims almost 2 GByte/s throughput for AES. The bottleneck will be the GBit
NIC on a PCI bus.
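(For the simulation angle: the standard way to turn a fast AES engine into a
stream of rands is to encrypt an incrementing counter, i.e. run the cipher
in counter mode. Below is a minimal illustrative sketch using OpenSSL's
software AES; driving the PadLock engine itself would go through a patched
OpenSSL or the kernel crypto layer, which is not shown, and the all-zero key
is a throwaway demo value.)

/* AES in counter mode as a pseudorandom byte stream (illustration only).
 * Build with:  gcc -O2 aesrng.c -lcrypto                                */
#include <stdio.h>
#include <openssl/aes.h>

int main(void)
{
    unsigned char key[16] = {0};        /* all-zero demo key: never use
                                           this for anything that matters */
    unsigned char ctr[16] = {0}, block[16];
    AES_KEY ek;
    unsigned long i, blocks = 1000000;  /* ~16 MB of output */
    unsigned long sink = 0;
    int j;

    AES_set_encrypt_key(key, 128, &ek);
    for (i = 0; i < blocks; i++) {
        AES_encrypt(ctr, block, &ek);   /* 16 pseudorandom bytes per call */
        sink ^= block[0];               /* pretend to consume the output  */
        for (j = 0; j < 16; j++)        /* increment the 128-bit counter  */
            if (++ctr[j] != 0)
                break;
    }
    printf("generated %lu bytes (checksum byte %lu)\n", blocks * 16, sink);
    return 0;
}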
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at streamline-computing.com Fri Mar 11 03:34:43 2005
From: john.hearns at streamline-computing.com (John Hearns)
Date: Fri, 11 Mar 2005 08:34:43 +0000
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To:
References:
Message-ID: <1110530084.5349.24.camel@Vigor45>
On Thu, 2005-03-10 at 11:13 -0800, David Mathog wrote:
> "Omri Schwarz" wrote:
>
> > Would specialty (albeit commodity) coprocessors hanging off a
> > PCI slot be suitable for your applications?
>
> Well, a couple of points:
>
> 1. There may very well be no PCI slot available for such a
> coprocessor. Forget about it for a 1U,
Don't wish to be a pain,
but our 1U nodes have space for at least one PCI card.
MSI dual-Opteron nodes can take two.
One is on a riser card above the mainboard, one
is at 180 degrees to that.
The same motherboard is used in the Sun V20z, so it can take two PCI cards.
I'd imagine the same is true for other suppliers.
Is your point that there may be a slot and a riser card,
but there is not enough space to fit a coprocessor card in
a 1U? I'll agree there.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From john.hearns at streamline-computing.com Fri Mar 11 04:44:47 2005
From: john.hearns at streamline-computing.com (John Hearns)
Date: Fri, 11 Mar 2005 09:44:47 +0000
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf cluster.
In-Reply-To:
References:
Message-ID: <1110534287.6204.3.camel@Vigor45>
On Thu, 2005-03-10 at 11:13 -0800, David Mathog wrote:
> "Omri Schwarz" wrote:
>
> > Would specialty (albeit commodity) coprocessors hanging off a
> > PCI slot be suitable for your applications?
>
> 2. Assuming that you can get the card in there and not fry anything
> you're going to have to at the very least rewrite the code to make
> use of the specialized hardware. Unless this board comes with a
> very, very clever compiler to automagically detect the bits it can
> do best you're looking at some serious time and/or money spent on
> programming.
>
We have looked at ClearSpeed's product:
http://www.clearspeed.com/products/apps.php
They have ported standard libraries, and say that if your software uses
BLAS etc. it will run without change.
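(The "runs without change" claim is really about code whose hot spot is
already a standard BLAS call; a toy sketch of such a call follows, assuming
any CBLAS implementation. Relinking against the accelerated library would
then be the only change.)

/* Toy DGEMM through the standard CBLAS interface; link against any BLAS
 * (e.g. -lcblas -latlas, or a vendor library) without source changes.   */
#include <stdio.h>
#include <cblas.h>

#define N 512

static double a[N * N], b[N * N], c[N * N];

int main(void)
{
    int i;

    for (i = 0; i < N * N; i++) {      /* fill with something non-trivial */
        a[i] = (double)(i % 7);
        b[i] = (double)(i % 13);
    }
    /* C = 1.0 * A * B + 0.0 * C, row-major N x N matrices */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                N, N, N, 1.0, a, N, b, N, 0.0, c, N);
    printf("c[0] = %g\n", c[0]);
    return 0;
}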
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Fri Mar 11 08:24:14 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 11 Mar 2005 08:24:14 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To: <20050311082447.GG17303@leitl.org>
References:
<20050311082447.GG17303@leitl.org>
Message-ID:
On Fri, 11 Mar 2005, Eugen Leitl wrote:
> On Thu, Mar 10, 2005 at 08:08:17PM -0500, Robert G. Brown wrote:
>
> > Yeah, I've done all of these computations and looked into various
> > quantum devices and entropy based generators. However, you'd be
> > surprised how difficult it is to make a HARDWARE based random number
> > device that passes e.g. diehard. There is a deep truth here. "Random
>
> http://www.via.com.tw/en/downloads/whitepapers/initiatives/padlock/evaluation_padlock_rng.pdf
>
> Support is there starting with kernel 2.6.11.
Yes, and also see the attached, which was prepared by the same company
for Intel and is pretty much a linear predecessor of this white paper.
The von Neumann transformation is what I was referring to as a means of
improving bit-level statistics, but a) it costs you at least half of the
total bits generated; and b) it only fixes the first moment of the bit
distribution (it restores the balance between 0's and 1's), but, as both
papers empirically note, it doesn't fix serial correlations (which follow
at best empirically known autocorrelation patterns) or higher-order (N at
a time) bit correlations.
Thus it is safe to say that while the sources are unpredictable, they
are not "random" (distributed as random numbers to all orders). And
they are slow. Or as you say (said) -- good for cryptography, not so
good for simulation, except to seed a decent PRNG. However, Linux folks
can do just as well for cryptography with /dev/random, which doesn't
require any special hardware at all. At least that's what I think --
dieharder doesn't yet support the whole NIST STS suite, but I personally
think that a whole lot of both STS and diehard will reduce to a single,
more direct test of the bit-level randomness of a generator that I HAVE
implemented. But since none of this is published (yet) and hence
refereed, I could be mistaken.
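(The flavor of such a direct bit-level test -- though certainly not
dieharder itself -- can be sketched in a few lines of C: chop a bit stream
into k-bit words, count each of the 2^k patterns, and compare against the
expected count n/2^k with a chi-squared statistic. Here the stream is just
the low bits of the C library's rand(), standing in for whatever generator
or device you actually want to test.)

/* Crude k-bit pattern frequency check (illustration, not dieharder/STS). */
#include <stdio.h>
#include <stdlib.h>

#define K 6                               /* pattern width in bits        */

int main(void)
{
    unsigned long count[1 << K] = {0};
    const unsigned long nwords = 10000000UL;
    unsigned long i;
    double expect = (double)nwords / (1 << K), chi2 = 0.0, d;
    int p;

    srand(1);
    for (i = 0; i < nwords; i++)          /* low K bits of each rand()    */
        count[rand() & ((1 << K) - 1)]++;

    for (p = 0; p < (1 << K); p++) {
        d = count[p] - expect;
        chi2 += d * d / expect;
    }
    /* A good generator gives chi^2 near the 2^K - 1 = 63 degrees of
     * freedom; a grossly larger value flags a bad bit-pattern spread.    */
    printf("chi^2 = %.1f (df = %d)\n", chi2, (1 << K) - 1);
    return 0;
}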
> > Actually IIRC there was some effort by Intel to build something into a
> > CPU or some other part of a mobo chipset, but I don't know what came of
> > it. The people who need this commercially are webvolken and encryption
>
> It's still there in some chipsets, but you can't rely on it being there if
> you don't control your hardware purchases.
This was the origin of the white paper attached -- part of their
original engineering effort, I believe.
> > algorithms, where "unpredictable" is almost as good as "random to the
> > nth bit of randomness". Nobody uses marginally secure encryption if
> > they can help it. However, these folks are totally happy with
> > kilorands/sec, let alone megarands/sec. I need hundreds of
> > megarands/sec.
>
> http://fp.gladman.plus.com/ACE/
> claims almost 2 GByte/s throughput for AES. The bottleneck will be the GBit
> NIC on a PCI bus.
I'll have to look into this. If the rands are simulation quality and
the device not too expensive, this could easily be worth it. This is
the sort of thing I was hoping to turn up in this discussion.
rgb
>
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Fri Mar 11 08:28:15 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Fri, 11 Mar 2005 08:28:15 -0500 (EST)
Subject: [Beowulf] Quasi-Non-Von-Neumann hardware in a Beowulf
In-Reply-To: <20050311082447.GG17303@leitl.org>
References:
<20050311082447.GG17303@leitl.org>
Message-ID:
On Fri, 11 Mar 2005, Eugen Leitl wrote:
OK, THIS time I'll attach the Intel paper. Oh wait. I can't -- the
list doesn't permit attachments. Damn. I think it is listed as the
third hit on a Google search for "Intel random number generator".
Gotta go. Busy morning.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gmpc at sanger.ac.uk Fri Mar 11 09:46:24 2005
From: gmpc at sanger.ac.uk (Guy Coates)
Date: Fri, 11 Mar 2005 14:46:24 +0000 (GMT)
Subject: Clarification: [Beowulf] hpl - large problems fail
In-Reply-To: <4230D090.4060301@ufl.edu>
References:
<1110402646.5643.127.camel@Vigor11> <4230A757.1020203@ufl.edu>
<4230D090.4060301@ufl.edu>
Message-ID:
> the command prompt when I run it. It fails when it checks the solution
> to linear equations. The residual is too high and fails. This is part
> of the data from my HPL.out file:
>
This could still be dodgy memory; if bits get flipped then you can expect
those sorts of numerical instabilities.
Try running a single HPL job on each machine. If you get the correct
answer on 3 machines and the wrong answer on one, then you've narrowed it
down to hardware.
If you get the wrong answer on all your machines then you probably have a
software problem. Try recompiling HPL with no compiler optimisations, a
different compiler and/or blas library.
If that doesn't work, then it might just be possible that you are into
weird hardware/kernel bug territory. I ran into similar HPL problems
whilst benchmarking a rather large hardware purchase we made several years
ago. The HPL residuals were coming out as NaN. Recompiling with a
different compiler gave the same result. Rather worryingly, the same
binaries ran correctly when run on different hardware. After a lot of head
scratching and phone calls to an extremely worried vendor ("Hey, this kit
you sold us can't do maths properly!") the problem was tracked down to a
dodgy kernel module. It turned out that the module provided by the vendor
to do console-over-LAN stomped over the floating-point registers under
certain circumstances.
Guy
--
Dr. Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ctierney at HPTI.com Fri Mar 11 10:06:21 2005
From: ctierney at HPTI.com (Craig Tierney)
Date: Fri, 11 Mar 2005 08:06:21 -0700
Subject: Clarification: [Beowulf] hpl - large problems fail
In-Reply-To:
References:
<1110402646.5643.127.camel@Vigor11> <4230A757.1020203@ufl.edu>
<4230D090.4060301@ufl.edu>
Message-ID: <1110553581.2823.11.camel@localhost.localdomain>
On Fri, 2005-03-11 at 07:46, Guy Coates wrote:
> > the command prompt when I run it. It fails when it checks the solution
> > to linear equations. The residual is too high and fails. This is part
> > of the data from my HPL.out file:
> >
>
> This could still be dodgy memory; if bits get flipped then you can expect
> those sorts of numerical instabilities.
>
> Try running a single HPL job on each machine. If you get the correct
> answer on 3 machines and the wrong answer on one, then you've narrowed it
> down to hardware.
>
> If you get the wrong answer on all your machines then you probably have a
> software problem. Try recompiling HPL with no compiler optimisations, a
> different compiler and/or blas library.
>
>
> If that doesn't work, then it might just be possible that you are into
> wierd hardware/kernel bug territory. I ran into similar HPL problems
> whilst benchmarking a rather large hardware purchase we made several years
> ago. The HPL residuals were coming out as NaN. Recompiling with a
> different compiler gave the same result. Rather worryingly, the same
> binaries ran correctly when run on different hardware. After alot of head
> scratching and phonecalls to an extremely worried vendor ("Hey, this kit
> you sold us can't do maths properly!") the problem was tracked down to a
> dodgy kernel module. It turned out that the module provided by the vendor
> to do console-over-lan stomped over the floating point registers under
> certain circumstances.
>
It could also be the interconnect. If you are using Ethernet I would
think that unlikely, but I have seen issues with high-speed
interconnects that had a problem with the PCI slot, where we would get
wrong answers when running HPL on more than two systems.
Craig
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Fri Mar 11 15:47:29 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Fri, 11 Mar 2005 21:47:29 +0100
Subject: [Beowulf] Via Now Shipping Dual-Processor Mini-ITX Board
Message-ID: <20050311204729.GA17303@leitl.org>
Link: http://slashdot.org/article.pl?sid=05/03/11/1840251
Posted by: timothy, on 2005-03-11 20:20:00
from the smallosity dept.
An anonymous reader writes "Via is now shipping its [1]first
dual-processor mini-ITX board. The DP-310 features two 1GHz
processors, gigabit Ethernet, support for SATA drives, and a
media-processing graphics chipset. It targets high-density
applications -- according to Via, a 42-U rack with 168 processors
would draw about 2.5 kilowatts, or about as much power as two hair
dryers." This also looks like the basis for a nice car computer. Also
on the small-computing front, an anonymous reader submits "General
Micro, meanwhile, last week released what it calls the [2]world's
fastest mini-ITX board, powered by a Pentium M clocked up to 2.3GHz. "
References
1. http://linuxdevices.com/news/NS7109201579.html
2. http://linuxdevices.com/news/NS5099841192.html
----- End forwarded message -----
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From nj at hemeris.com Fri Mar 11 03:28:33 2005
From: nj at hemeris.com (Nicolas Jungers)
Date: Fri, 11 Mar 2005 09:28:33 +0100
Subject: [Beowulf] How to disable GUI in Mac OS X?
In-Reply-To: <1110504727.3267.1.camel@linux>
References: <1110504727.3267.1.camel@linux>
Message-ID: <1110529713.6330.20.camel@lcube>
On Thu, 2005-03-10 at 19:32 -0600, Carlos Lopez Nataren wrote:
> I'm doing some performance test with a G5 Xserve, both with linux and
> MacOS X, but it seem unfair to test the performance of MacOS X with the
> graphical interface up, does anyone know how to disable the graphical
> interface so I just get a nice text console?
There are two ways to do it:
- at boot, hold down Command-S to go into single-user mode
- at the login window, use >console as the login name
Regards,
Nicolas
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From trippm at gmail.com Fri Mar 11 15:56:26 2005
From: trippm at gmail.com (Mauricio Carrillo Tripp)
Date: Fri, 11 Mar 2005 15:56:26 -0500
Subject: [Beowulf] Terrible scaling with a Cisco Catalyst 2948G-GE-TX Switch
Message-ID:
Hi, I'm new to this list. I hope there's somebody here who can help me out.
I've been using a cluster with 16 nodes for quite a while; the switch I'm using
is a cheap Dell switch (PowerConnect 2624, ~$300). I run molecular dynamics
simulations using NAMD, and the scaling I got was good (around 75%
efficiency on 16 nodes).
I recently got another 32 PCs to build a new cluster, but this time I had to
buy a more expensive switch (Cisco Catalyst 2948G-GE-TX, ~$3,000). To
my surprise, the scaling is just TERRIBLE, and after a lot of tests I finally
found that the problem is the switch (I'm not sure what or why exactly, though).
Using the Cisco switch or the Dell switch on the same cluster running
exactly the same program I get very different scaling (please see Fig. 3 at
http://chem.acad.wabash.edu/~trippm/Clusters/performance.php).
I logged into the Cisco switch's interface and disabled spantree, enabled
portfast, and set the ports to always be 1000tx (instead of auto-detect),
and nothing I do seems to help.
So, if there is someone else using this type of Cisco switch, could you tell me
if you found the same behaviour and figured out what was wrong?
Any other thoughts will be appreciated too.
Thanks.
--
Mauricio Carrillo Tripp, PhD
Department of Chemistry
Wabash College
trippm at wabash.edu
http://chem.acad.wabash.edu/~trippm
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From hahn at physics.mcmaster.ca Sat Mar 12 00:35:16 2005
From: hahn at physics.mcmaster.ca (Mark Hahn)
Date: Sat, 12 Mar 2005 00:35:16 -0500 (EST)
Subject: [Beowulf] Terrible scaling with a Cisco Catalyst 2948G-GE-TX
Switch
In-Reply-To:
Message-ID:
> is a cheap DELL switch (PowerConnect 2624, ~$300). I run Molecular Dynamics
relabeled SMC, iirc. reasonable commodity (in a good way) hardware.
> I recently got another 32 pc's to build a new cluster, but this time I had to
> buy a more expensive switch (Cisco Catalyst 2948G-GE-TX, ~$3,000). To
definitely an older-generation switch - can't even do full line rate
on all ports, since the backplane has just 12 Gbps bandwidth.
http://www.cisco.com/en/US/products/hw/switches/ps606/ps5502/
(which is feeling-lucky if you google the switch's name).
> my surprise, the scaling is just TERRIBLE, and after a lot of tests I finally
> found that the problem is the switch (I'm not sure what or why exactly, though).
bandwidth. the switch appears to be a hopped-up 100bT switch,
kind of sad actually.
> exactly the same program I get very different scaling (please see Fig. 3 at
> http://chem.acad.wabash.edu/~trippm/Clusters/performance.php).
when your speedup flattens and falls, you know you hit a bottleneck...
> Any other thoughts will be appreciated too.
ebay it; maybe someone will buy it for an office (where the design is
actually OK, since ports won't saturate for long.) Cisco doesn't have
any special magic in networking, certainly not in anything as commoditized
as ethernet switching.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From mechti01 at luther.edu Fri Mar 11 22:19:40 2005
From: mechti01 at luther.edu (Timo Mechler)
Date: Fri, 11 Mar 2005 21:19:40 -0600 (CST)
Subject: [Beowulf] Grants for Beowulf Clusters
Message-ID: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
Hi all,
I'm wondering what kind of success rate people are having with obtaining
grants for Beowulf-type Linux clusters (for example, from the National
Science Foundation). Let me give you a little bit more info as to why I'm
asking this: I'm a junior undergraduate at a small liberal arts college
in Iowa (~2600 students), and have been pursuing Beowulf clusters on my
own for well over a year now. I believe strongly that even though my
school is small, several departments on campus could benefit from the use
of a Beowulf cluster in the research that does go on. I've been using
older, slower machines as a proof of concept for now. Ideally, we would
eventually want a faster Beowulf system that offers significant
improvements over anything desktop PCs have to offer nowadays. Being
that money is an issue at smaller schools, is there any way I could obtain
a grant for a Beowulf cluster? If so, besides the NSF, what would be some
other sources to apply to? Since some of you on this list come from big
companies or universities, I would appreciate any insight and suggestions
you can give me. All input is appreciated. Thanks in advance!
Best Regards,
-Timo Mechler
--
Timo R. Mechler
mechti01 at luther.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jake at spiekerfamily.com Sat Mar 12 22:26:11 2005
From: jake at spiekerfamily.com (Jake Thebault-Spieker)
Date: Sat, 12 Mar 2005 22:26:11 -0500
Subject: [Beowulf] Folding@Home on a Beowulf?
Message-ID: <4233B2D3.7000807@spiekerfamily.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Does anybody have any experience with Folding at Home
(http://folding.stanford.edu)? I'd like to run it on my six-node cluster
of 133MHz CPUs, but the cluster won't be online. Is there another way to
get the folding jobs -- like downloading them at a different location and
then transferring them to the cluster?
- --
I think computer viruses should count as life.
I think it says something about human nature
that the only form of life we have created so far is purely destructive.
We've created life in our own image.
- --Stephen Hawking
010010100110000101101
011011001010010000001
010100011010000110010
101100010011000010111
010101101100011101000
010110101010011011100
000110100101100101011
010110110010101110010
/www.plinko.net\\>
Jake Thebault-Spieker
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFCM7LTI2YvXV9Bxi0RApkcAJ93zARE+h0iqhrqAmP0JbCxXSgdnwCg4eWw
HuIQGgZXjZmXn99FcUKNFlc=
=opi/
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Mon Mar 14 09:47:02 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Mon, 14 Mar 2005 09:47:02 -0500 (EST)
Subject: [Beowulf] Grants for Beowulf Clusters
In-Reply-To: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
References: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
Message-ID:
On Fri, 11 Mar 2005, Timo Mechler wrote:
> Hi all,
>
> I'm wondering what kind of success rate people are having with obtaining
> grants for Beowulf type Linux Clusters (for example, from the National
> Science Foundation). Let me give you a little bit more info as to why I'm
> asking this: I'm a junior undergraduate at a small liberal arts college
> in Iowa (~2600 students), and have solely been pursuing Beowulf clusters
> for well over a year now. I believe strongly that even though that my
> school is small, several departments on campus could benefit from the use
> of a beowulf cluster in the research that does go on. I've been using
> older, slower machines as a proof of concept for now. Ideally, we would
> want a faster beowulf system eventually that offers significant
> improvements over anything desktop pc's have to offer nowadays. Being
> that money is an issue at smaller schools, is there any I could obtain a
> grant for a beowulf cluster? If so, besides the NSF, what would be some
> other sources to apply to? Since some of you guys on this come from big
> companies or Univesities, I would appreciate any insight and suggestions
> you can give me. All input is appreciated. Thanks in advance!
Nearly all the university clusters in the world (with a few exceptions
like mine or Jeff Layton's home clusters :-) are purchased with grant money
of one sort or another, so "success" or not it's the only game in town.
Businesses, of course, often pay out of pocket, but that's the game in
the big city (so to speak).
MOST of the university clusters are likely sponsored to do some specific
piece of grant-funded research. That is, I want to do Monte Carlo
research into the dynamic and static critical properties of continuous
Heisenberg ferromagnets, so I write a proposal to do this research that
contains a hardware budget for the cluster upon which I plan to do it.
However, the NSF has LOTS of grants for different categories, including
grants to stimulate and improve undergraduate educational experiences or
to build shared infrastructure. Few of them are likely to be available
to an undergraduate, though.
What I'd recommend is that you find a local faculty person (or three)
that you can convince to share your vision and that might "need" the
cluster either to teach students or to do some actual research (or
both). Perhaps one from computer science, one from physics, one from
chemistry or biology. See if all of you together can write an
institutional grant proposal for a startup cluster. Note that this need
not be terribly expensive or a lot of money -- you can build a perfectly
reasonable starter cluster for around $20-25K if you can get the school
to pick up the tab for power and cooling (likely to be a few thousand a
year). If the cluster is thoroughly firewalled from the university
network backbone, students could probably install it and administer it
and manage it and write applications for it for the participating
faculty or for course credit. It would be fun.
Note that you CAN get started for a lot less money than even this.
Barebones compute nodes can be had for almost nothing. Doug Eadline and
Jeff Layton recently demonstrated this rather spectacularly here:
http://www.clusterworld.com/value_cluster.shtml
This is basically an 8 node cluster made with all new components for
$2500. While they are experts (and hence can actually ride the bleeding
edge of what will work for the least possible amount of money) this does
indicate what can be done. More reasonably, you can get started with
something like $500 for miscellaneous stuff (shelving, network switch
and cabling), $1000 for a fairly nice desktop to serve as front end,
fileserver and head node, and somewhere between $300 and $600 per node,
where the range in prices reflects the amount of memory and hard disk
and the kind of network per node. Check out Pricewatch -- some of the
barebones, no-OS systems are less than $300 even for e.g. AMD64s (not
exactly turtle-slow, that is :-).
What this means is that NSF or not you CAN build a student cluster at
your school. You can damn near fund it with a bake sale or other
simple/fun fundraisers. This kind of money one can often convince
the school to pony up out of its petty cash budget, or you can start a
"cluster computing club" and (maybe) get a few thou out of your school's
clubs and activities budget. Or you and seven friends can each
contribute $500 tax-deductable dollars (yes, sigh, from your parents).
Or you can walk main street and knock on doors and ask local businesses
to fund it, or...
Anyway, you get the point. You don't need NSF money to get started, and
in fact it would be a lot easier for you to build a small cluster to get
started first and THEN look for that $20K grant from NSF to make it into
a middling big cluster!
rgb
>
> Best Regards,
>
> -Timo Mechler
>
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From james.p.lux at jpl.nasa.gov Mon Mar 14 10:04:53 2005
From: james.p.lux at jpl.nasa.gov (Jim Lux)
Date: Mon, 14 Mar 2005 07:04:53 -0800
Subject: [Beowulf] Grants for Beowulf Clusters
References: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
Message-ID: <003401c528a7$2db91180$30f49580@LAPTOP152422>
I can't claim that I am successful at getting grants for clusters,
however...
If you can make a good case that a cluster will make it possible to solve
some other "important" problem, the odds go up greatly. Think of a cluster
as a tool, just like a microscope or an ultra centrifuge or a furnace. How
would you justify getting the budget for a big microscope (like a SEM)?
The key is to have a problem that everyone wants to attack, and the cluster
being the way to attack it. You said you've been doing proof-of-concept
work. Is that to prove that you can build a cluster, or that you've
demonstrated some useful "work" with the cluster on a problem that someone
is interested in (i.e. one for which there is funding available)?
Otherwise, you're a solution looking for a problem.
Jim Lux
----- Original Message -----
From: "Timo Mechler"
To:
Sent: Friday, March 11, 2005 7:19 PM
Subject: [Beowulf] Grants for Beowulf Clusters
> Hi all,
>
> I'm wondering what kind of success rate people are having with obtaining
> grants for Beowulf type Linux Clusters (for example, from the National
> Science Foundation). Let me give you a little bit more info as to why I'm
> asking this: I'm a junior undergraduate at a small liberal arts college
> in Iowa (~2600 students), and have solely been pursuing Beowulf clusters
> for well over a year now. I believe strongly that even though that my
> school is small, several departments on campus could benefit from the use
> of a beowulf cluster in the research that does go on. I've been using
> older, slower machines as a proof of concept for now. Ideally, we would
> want a faster beowulf system eventually that offers significant
> improvements over anything desktop pc's have to offer nowadays. Being
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Mon Mar 14 12:25:31 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 14 Mar 2005 18:25:31 +0100
Subject: [Beowulf] Grants for Beowulf Clusters
Message-ID: <3.0.32.20050314182531.015df148@pop.xs4all.nl>
Oh well,
My chess program diep is looking for a cluster to run on at important
tournaments, including the world computer chess championships. Lots of
publicity and bragging rights in such a case :)
Volunteers can email me: diep at xs4all.nl
Warning: I need some testing time; in a competitive league you can't afford
to run without professional testing. Testing is 99% of the achievement. In
German they say: "Übung macht den Meister" (practice makes the master).
That is exactly the problem with government grants: they do not
understand the need for testing at all, and getting testing time is
impossible there. That's why the majority of all serious software doesn't
run on such clusters.
Jonathan Schaeffer told me:
"You must run at what you can test."
In general, however, you see the biggest nonsense getting written just to
get system time. Marketing departments are even worse, though. Like: "this
new machine is going to research DNA" -- when only 0.5% of all system time
goes to such problems. A 2048-processor Itanium 2 supercomputer with the
most expensive quad-capable Itanium 2 CPUs (the dual-capable CPUs are
cheaper) is a bit of overkill for that. But well...
At 07:04 AM 3/14/2005 -0800, Jim Lux wrote:
>I can't claim that I am successful at getting grants for clusters,
>however...
>If you can make a good case that a cluster will make it possible to solve
>some other "important" problem, the odds go up greatly. Think of a cluster
>as a tool, just like a microscope or an ultra centrifuge or a furnace. How
>would you justify getting the budget for a big microscope (like a SEM)?
>
>The key is to have a problem that everyone wants to attack, and the cluster
>being the way to attack it. You said you've been doing proof of concept..
>Is that to prove that you can build a cluster, or that you've demonstrated
>some useful "work" with the cluster on a problem that someone is interested
>in (i.e. for which there is funding available).
>
>Otherwise, you're a solution looking for a problem.
>
>Jim Lux
>----- Original Message -----
>From: "Timo Mechler"
>To:
>Sent: Friday, March 11, 2005 7:19 PM
>Subject: [Beowulf] Grants for Beowulf Clusters
>
>
>> Hi all,
>>
>> I'm wondering what kind of success rate people are having with obtaining
>> grants for Beowulf type Linux Clusters (for example, from the National
>> Science Foundation). Let me give you a little bit more info as to why I'm
>> asking this: I'm a junior undergraduate at a small liberal arts college
>> in Iowa (~2600 students), and have solely been pursuing Beowulf clusters
>> for well over a year now. I believe strongly that even though that my
>> school is small, several departments on campus could benefit from the use
>> of a beowulf cluster in the research that does go on. I've been using
>> older, slower machines as a proof of concept for now. Ideally, we would
>> want a faster beowulf system eventually that offers significant
>> improvements over anything desktop pc's have to offer nowadays. Being
>
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Mon Mar 14 12:34:12 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Mon, 14 Mar 2005 18:34:12 +0100
Subject: [Beowulf] The move to gigabit - technical questions
Message-ID: <3.0.32.20050314183411.0122fa88@pop.xs4all.nl>
Good evening,
It's interesting to investigate what gigabit can do for small home clusters.
Any latency-oriented approach is obviously doomed to fail at gigabit. But
the cards are cheap; I already see several on offer for 40 euro.
The first important question is of course how much system time those NICs
eat when their bandwidth is fully loaded.
For example, I have an old dual K7 here with PCI 2.2 (32-bit, 33 MHz).
Suppose I put a gigabit card in it.
Say I ship 6 messages a second, 8 MB of data at a time, sending and
receiving in turn. So it ships a packet of 8 MB, then receives a packet
of 8 MB.
Other than the cost of the thread that stores the packet to RAM, does such
a card in any way stop or block the CPUs, which are 100% loaded with
searching software (my chess program diep in this case)?
What penalty, other than that thread handling the message, is there in
terms of system time lost to the two searching processes?
Oh, by the way, I assume that gigabit can handle 48 MB/s of user data?
Vincent
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gotero at linuxprophet.com Tue Mar 15 01:43:06 2005
From: gotero at linuxprophet.com (Glen Otero)
Date: Mon, 14 Mar 2005 22:43:06 -0800
Subject: [Beowulf] pe's with SGE 6.0
Message-ID: <847768a8906d4db72b7d9d073b3f9c21@linuxprophet.com>
I think I broke something while playing with grid engine 6.0,
pvm-3.4.4-19, and mpich2. Anyone have pvm and mpi/mpich templates that
they know work in creating pe's with SGE 6.0?
Thanks!
Glen
Glen Otero Ph.D.
Linux Prophet
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From atp at piskorski.com Tue Mar 15 06:05:41 2005
From: atp at piskorski.com (Andrew Piskorski)
Date: Tue, 15 Mar 2005 06:05:41 -0500
Subject: [Beowulf] The move to gigabit - technical questions
In-Reply-To: <3.0.32.20050314183411.0122fa88@pop.xs4all.nl>
References: <3.0.32.20050314183411.0122fa88@pop.xs4all.nl>
Message-ID: <20050315110541.GA68651@piskorski.com>
On Mon, Mar 14, 2005 at 06:34:12PM +0100, Vincent Diepeveen wrote:
> Good evening,
>
> It's interesting to investigate what gigabit can do for small home clusters.
>
> Any latency oriented approach is doomed to fail obviously at gigabit. But
> they're cheap. For 40 euro i see several getting offered already.
Which is 50 USD or so? You can sometimes get gigabit Ethernet for MUCH
cheaper than that if you look around, particularly on Ebay. E.g., I
recently bought 10 Intel Pro/1000 MT cards (the lowest-end 32-bit PCI
desktop kind, but still) for $11 each, from an Ebay vendor who
turned out to be local.
--
Andrew Piskorski
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From dag at sonsorol.org Tue Mar 15 06:25:23 2005
From: dag at sonsorol.org (Chris Dagdigian)
Date: Tue, 15 Mar 2005 06:25:23 -0500
Subject: [Beowulf] Re: pe's with SGE 6.0
In-Reply-To: <47678a7e1f54e22f79f188cae34ec482@linuxprophet.com>
References: <47678a7e1f54e22f79f188cae34ec482@linuxprophet.com>
Message-ID: <4236C623.3040401@sonsorol.org>
Hi Glen,
Parallel environments (PE's) are "mostly" the same in Grid Engine 6 vs
5.3 in my experience.
The main "gotcha" difference is that in SGE 6 you tell the *qeueue* the
list of PE's it is able to support while in SGE 5 the opposite occured
-- the PE itself was configured with a list of queues that it was active
in. The other addition is the "urgency_slots" param (I think) which was
not in SGE 5.3.
If you had PE definitions or deployment scripts that worked in SGE 5.3
but not in 6 it may be due to the above. The "pe_list" parameter has
moved from the PE object itself and into the queue configuration.
For SGE 6 there are still the usual PVM and MPI templates and examples
that come with the distribution. Just look in $SGE_ROOT/pvm/ and
$SGE_ROOT/mpi/.
Reuti also just updated the Grid Engine tight LAMMPI HOWTO which is here:
http://gridengine.sunsource.net/project/gridengine/howto/lam-integration/lam-integration.html
Back to PE's ...
This is what a generic loosely integrated MPICH PE would look like in SGE 6:
> workgroupcluster:~ admin$ qconf -sp mpich
> pe_name mpich
> slots 512
> user_lists NONE
> xuser_lists NONE
> start_proc_args /common/sge/mpi/startmpi.sh $pe_hostfile
> stop_proc_args /common/sge/mpi/stopmpi.sh
> allocation_rule $fill_up
> control_slaves FALSE
> job_is_first_task TRUE
> urgency_slots min
Note that there is no list of queues that the PE runs in. This has moved.
The "pe_list" is now part of the queue configuration:
> workgroupcluster:~ admin$ qconf -sq all.q
> qname all.q
> hostlist @allhosts
> seq_no 0
> load_thresholds np_load_avg=1.75
> suspend_thresholds NONE
> nsuspend 1
> suspend_interval 00:05:00
> priority 0
> min_cpu_interval 00:05:00
> processors UNDEFINED
> qtype BATCH INTERACTIVE
> ckpt_list NONE
> pe_list make mpich
> rerun FALSE
< .... SNIP .... >
I've tried to list the differences between Grid Engine 5 and Grid Engine
6 at this URL:
http://bioteam.net/dag/gridengine-6-features.html
Not sure if I got it all but feedback/corrections are welcome.
Regards,
Chris
Glen Otero wrote:
> I think I broke something while playing with grid engine 6.0,
> pvm-3.4.4-19, and mpich2. Anyone have pvm and mpi/mpich templates that
> they know work in creating pe's with SGE 6.0?
>
> Thanks!
>
> Glen
>
> Glen Otero Ph.D.
>
--
Chris Dagdigian,
BioTeam - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag Web: http://bioteam.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From felix.rauch.valenti at gmail.com Tue Mar 15 07:51:49 2005
From: felix.rauch.valenti at gmail.com (Felix Rauch Valenti)
Date: Tue, 15 Mar 2005 23:51:49 +1100
Subject: [Beowulf] Terrible scaling with a Cisco Catalyst 2948G-GE-TX
Switch
In-Reply-To:
References:
Message-ID: <4eafc81b05031504517e478fa9@mail.gmail.com>
On Fri, 11 Mar 2005 15:56:26 -0500, Mauricio Carrillo Tripp
wrote:
[...]
> I recently got another 32 pc's to build a new cluster, but this time I had to
> buy a more expensive switch (Cisco Catalyst 2948G-GE-TX, ~$3,000). To
> my surprise, the scaling is just TERRIBLE, and after a lot of tests I finally
> found that the problem is the switch (I'm not sure what or why exactly, though).
[...]
I don't know about the 2948G you mention above, but we had some
serious performance problems with a 2900XL about 4 years ago. You can
find my mails from back then in the list archives:
http://www.beowulf.org/archive/2001-August/004688.html
http://www.beowulf.org/archive/2002-May/007161.html
http://www.scyld.com/pipermail/beowulf/2001-May/003763.html
Basically the switch's performance broke down completely as soon as
more than half of its ports were fully loaded with full-duplex
communication.
- Felix
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Tue Mar 15 11:24:51 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 15 Mar 2005 17:24:51 +0100
Subject: [Beowulf] CFP for ISPA'05 [Deadline April 1,
2005] (fwd from rabenseifner@hlrs.de)
Message-ID: <20050315162451.GK17303@leitl.org>
----- Forwarded message from Rolf Rabenseifner -----
From: Rolf Rabenseifner
Date: Tue, 15 Mar 2005 17:07:08 +0100 (CET)
To: eugen at leitl.org
Subject: CFP for ISPA'05 [Deadline April 1, 2005]
Dear HLRS User or member of my course-invitation-list,
as member of the program committee, I'm sending you the CFP
for the ISPA 2005.
Best regards
Rolf Rabenseifner
-------------------------------------------------------------------------
Third International Symposium on Parallel and Distributed Processing and
Applications (ISPA 2005) Nanjing, China, Nov. 2-5, 2005
URL: http://keysoftlab.nju.edu.cn/ispa2005/
Following the traditions of previous successful ISPA conferences, ISPA '03
(held in Aizu-Wakamatsu City, Japan) and ISPA '04 (held in Hong Kong),
the objective of ISPA '05 is to provide a forum for scientists and engineers in
academia and industry to exchange and discuss their experiences, new ideas,
research results, and applications about all aspects of parallel and
distributed computing and networking. ISPA '05 will feature session
presentations, workshops, tutorials and keynote speeches.
Topics of particular interest include, but are not limited to :
Computer networks
Network routing and communication algorithms
Parallel/distributed system architectures
Tools and environments for software development
Parallel/distributed algorithms
Parallel compilers
Parallel programming languages
Distributed systems
Wireless networks, mobile and pervasive computing
Reliability, fault-tolerance, and security
Performance evaluation and measurements
High-performance scientific and engineering computing
Internet computing and Web technologies
Database applications and data mining
Grid and cluster computing
Parallel/distributed applications
High performance bioinformatics
Submissions should include an abstract, key words, the e-mail address of the
corresponding author, and must not exceed 15 pages, including tables and
figures, with PDF, PostScript, or MS Word format. Electronic submission through
the submission website is strongly encouraged. Hard copies will be accepted
only if electronic submission is not possible. Submission of a paper should be
regarded as an undertaking that, should the paper be accepted, at least one of
the authors will register and attend the conference to present the work.
Important Dates:
Workshop proposals due: April 1, 2005
Paper submission due: April 1, 2005
Acceptance notification: July 1, 2005
Camera-ready due: July 30, 2005
Conference: Nov. 2-5, 2005
Publication:
The proceedings of the symposium will be published in Springer's Lecture Notes
in Computer Science. A selection of the best papers for the conference will be
published in a special issue of The Journal of Supercomputing and International
Journal of High Performance Computing and Networking (IJHPCN).
General Co-Chairs:
Jack Dongarra, University of Tennessee, USA
Jiannong Cao, Hong Kong Polytechnic University, China
Jian Lu, Nanjing University, China
Program Co-Chair:
Yi Pan, Georgia State University, USA
Daoxu Chen, Nanjing University, China
Vice Program Co-Chairs:
Algorithms
Ivan Stojmenovic, University of Ottawa, Canada
Architecture and Networks
Mohamed Ould-Khaoua, University of Glasgow, UK
Middleware and Grid Computing
Mark Baker, University of Portsmouth, UK
Software
Jingling Xue, University of New South Wales, Australia
Applications
Zhi-Hua Zhou, Nanjing University, China
Steering Committee Co-Chairs
Sartaj Sahni, University of Florida, USA
Yaoxue Zhang, Ministry of Education, China
Minyi Guo, University of Aizu, Japan
Steering Committee:
Jiannong Cao, Hong Kong PolyU, China
Francis Lau, Univ. of Hong Kong, China
Yi Pan, Georgia State Univ. USA
Li Xie, Nanjing University, China
Jie Wu, Florida Atlantic Univ. USA
Laurence T. Yang, St. Francis Xavier Univ. Canada
Hans P. Zima, California Institute of Technology, USA
Weiming Zheng, Tsinghua University, China
Local Organizing Committee Co-Chairs:
Xianglin Fei, Nanjing University, China
Baowen Xu, Southeast University, China
Ling Chen, Yangzhou University, China
Workshop Chair
Guihai Chen, Nanjing University, China
Tutorial Chair
Yuzhong Sun, Institute of Computing Technology, CAS, China
Publicity Chair:
Cho-Li Wang, Univ. of Hong Kong, China
Publication Chair:
Hui Wang, University of Aizu, Japan
Registration Chair:
Xianglin Fei, Nanjing University, China
Program Committee:
See web page http://keysoftlab.nju.edu.cn/ispa2005/ for details.
----- End forwarded message -----
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Tue Mar 15 12:26:57 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Tue, 15 Mar 2005 18:26:57 +0100
Subject: [Beowulf] [Bioclusters] Direct connect infiniband/quadrics? (fwd
from farul@aldrich.com.my)
Message-ID: <20050315172657.GR17303@leitl.org>
----- Forwarded message from Farul Mohd Ghazali -----
From: Farul Mohd Ghazali
Date: Wed, 16 Mar 2005 01:22:40 +0800
To: "Clustering, compute farming & distributed computing in life science informatics"
Subject: [Bioclusters] Direct connect infiniband/quadrics?
X-Mailer: Apple Mail (2.619.2)
Reply-To: "Clustering, compute farming & distributed computing in life science informatics"
Has anyone had any experience with a direct connect/point-to-point
implementation of Quadrics or InfiniBand? I talked to a small lab
doing some computational chemistry and molecular dynamics work and
they're interested in setting up a cluster but there is a need to
justify the cost of a cluster before the budget can be approved.
During the discussion, the idea of using direct connect infiniband or
quadrics on two dual or quad Opteron nodes came up as a testbed
platform to justify to management. From a price point of view, this is
very attractive since it'll probably cost less than $40,000 (two quad
Opterons, two Quadrics cards) for a testbed system. Money is tight...
So, is this setup workable? In theory this should be faster than a
gigabit-based interconnect, even if it's just two nodes, but I'd welcome
any other ideas/suggestions. Thanks.
-- "Leadership & Life-long Learning" --
Farul Mohd. Ghazali
Manager, Systems & Bioinformatics
Open Source Systems Sdn. Bhd.
www.aldrich.com.my Tel: +603-8656 0139/29 Fax: +603-8656 0132
_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters
----- End forwarded message -----
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From gotero at linuxprophet.com Tue Mar 15 13:10:32 2005
From: gotero at linuxprophet.com (Glen Otero)
Date: Tue, 15 Mar 2005 10:10:32 -0800
Subject: [Beowulf] Re: [Bioclusters] Direct connect infiniband/quadrics?
In-Reply-To: <6b93220479d766d0bcbc61d55542b657@aldrich.com.my>
References: <9287779F96BDD311804100508B8B64230D4643E7@KANATAEXCHANGE1>
<41E688D7.7080302@scalableinformatics.com>
<41E68DCA.7080305@georgetown.edu>
<6b93220479d766d0bcbc61d55542b657@aldrich.com.my>
Message-ID: <493750150fb6900d633f50fc43108429@linuxprophet.com>
Speaking to the workable question--I've built small clusters just like
you described with Quadrics, Infiniband and Rocks. Quadrics support
isn't built into Rocks, but the Quadrics folks typically make their
software Rocks-aware. Infiniband support is supposedly built into
Rocks, but I haven't heard any success stories with it. It seems that
it's too Infinicon-centric. So you may be limited to using Infinicon
gear to make that work. I have experience making the openIB stuff work
with Rocks, so that is the path I would recommend. Mellanox and Topspin
are good to work with, as is Quadrics, when it comes to shoe-horning
their software onto Rocks clusters.
You'll see better performance even with two nodes. But I'd take a hard
look at what it will cost you to scale out with either interconnect and
decide if the difference in latency is worth the difference in price.
Glen
On Mar 15, 2005, at 9:22 AM, Farul Mohd Ghazali wrote:
>
> Has anyone had any experience with a direct connect/point-to-point
> implementation of Quadrics or Inifiniband? I talked to a small lab
> doing some computational chemistry and molecular dynamics work and
> they're interested in setting up a cluster but there is a need to
> justify the cost of a cluster before the budget can be approved.
>
> During the discussion, the idea of using direct connect infiniband or
> quadrics on two dual or quad Opteron nodes came up as a testbed
> platform to justify to management. From a price point of view, this is
> very attractive since it'll probably cost less than $40,000 (two quad
> Opterons, two Quadrics cards) for a testbed system. Money is tight...
>
> So, is this setup workable? In theory this should be faster than a
> gigabit based interconnect, even if it's just two nodes but I'd
> welcome any other ideas/suggestions. Thanks.
>
>
> -- "Leadership & Life-long Learning" --
>
> Farul Mohd. Ghazali
> Manager, Systems & Bioinformatics
> Open Source Systems Sdn. Bhd.
> www.aldrich.com.my Tel: +603-8656 0139/29 Fax: +603-8656 0132
>
> _______________________________________________
> Bioclusters maillist - Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
>
>
Glen Otero Ph.D.
Linux Prophet
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jzamor at gmail.com Tue Mar 15 02:51:36 2005
From: jzamor at gmail.com (Josh Zamor)
Date: Tue, 15 Mar 2005 00:51:36 -0700
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
Message-ID: <19430ea71246a4e9a13899873914bff7@gmail.com>
I have just started in on programming using PVM and have run into an
odd problem. I have written a C (c99) program that calculates the
factorial of a number by calculating parts of the range. I'm using the
GMP for dealing with large numbers (I have done this program
successfully before using numerous methods including pthreads). The
basic way it works for the cluster is that the program starts on a
machine, determines the subrange to be calculated for each task and
then waits for each process to come back with the answer for its
subrange. The main process then finds the total result of the factorial
by multiplying the subrange results back together... Pretty
standard...
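The subrange step itself is just a running product; a minimal sketch of it with plain GMP (names are illustrative, not taken from the actual program):

/* Sketch: compute lo * (lo+1) * ... * hi with GMP and render it as a
 * decimal string for sending back to the parent. */
#include <gmp.h>

char *factorial_subrange(unsigned long lo, unsigned long hi)
{
    mpz_t acc;
    mpz_init_set_ui(acc, 1);
    for (unsigned long k = lo; k <= hi; k++)
        mpz_mul_ui(acc, acc, k);           /* acc *= k */
    char *s = mpz_get_str(NULL, 10, acc);  /* NULL => GMP allocates the buffer */
    mpz_clear(acc);
    return s;                              /* this is what gets pvm_pkstr()'d */
}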
The problem is that after a subrange is calculated by a task the result
is put into a character array (null terminated, created from GMP's
mpz_get_str()), it is packaged using pvm_pkstr() and sent to the parent.
The parent can receive this using pvm_recv(), but as soon as it tries
to store this into a character array in the program using pvm_upkstr(),
or explore it with pvm_bufinfo(), it segfaults.
I've also tried this in a couple of different ways: passing ints
works, but passing strings (either large or small) results in a
segfault.
In my two-machine "cluster" learning setup, one box is running Mac OSX and the other
is running Gentoo Linux. It is only the Linux box that segfaults; the
OSX box finds the answer correctly. The segfault happens whenever the
Linux box is part of the PVM-created cluster, or when PVM is run alone
(no other boxes in the cluster) on the Linux box.
If anyone has any idea why this is happening, or why it only happens on
the Linux box, I would be grateful. Thanks, tech details follow:
Gentoo Linux Box:
Proc: AMD AthlonXP 1700+
RAM: 512MB.
Kernel: 2.4.26-gentoo-r9
PVM: 3.4.5
GMP: 4.1.4
GCC: 3.3.5
Mac OSX:
Proc: G4 1GHz
RAM: 768MB
Kernel: 10.3.8, Mach 7.8.0
PVM: 3.4.5
GMP: 4.1.4
GCC: 3.3 (20030304)
Thanks again.
Regards,
-J Zamor
jzamor at gmail.com
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From bjornts at mi.uib.no Tue Mar 15 08:47:53 2005
From: bjornts at mi.uib.no (Bjorn Tore Sund)
Date: Tue, 15 Mar 2005 14:47:53 +0100 (CET)
Subject: [Beowulf] Re: Grants for Beowulf Clusters
In-Reply-To: <200503142001.j2EK0FuW014048@bluewest.scyld.com>
References: <200503142001.j2EK0FuW014048@bluewest.scyld.com>
Message-ID:
On Mon, 14 Mar 2005 beowulf-request at beowulf.org wrote:
> Message: 4
> Date: Mon, 14 Mar 2005 07:04:53 -0800
> From: "Jim Lux"
> Subject: Re: [Beowulf] Grants for Beowulf Clusters
> To: "Timo Mechler" ,
> Message-ID: <003401c528a7$2db91180$30f49580 at LAPTOP152422>
> Content-Type: text/plain; charset="iso-8859-1"
>
> I can't claim that I am successful at getting grants for clusters,
> however...
> If you can make a good case that a cluster will make it possible to solve
> some other "important" problem, the odds go up greatly. Think of a cluster
> as a tool, just like a microscope or an ultra centrifuge or a furnace. How
> would you justify getting the budget for a big microscope (like a SEM)?
>
> The key is to have a problem that everyone wants to attack, and the cluster
> being the way to attack it. You said you've been doing proof of concept..
> Is that to prove that you can build a cluster, or that you've demonstrated
> some useful "work" with the cluster on a problem that someone is interested
> in (i.e. for which there is funding available).
>
> Otherwise, you're a solution looking for a problem.
Moving further away from the original question, I'd like to expand on
the above. A Beowulf cluster can be used to solve problems. You need
the problem. You also need to make sure that any and all clusters you
get have the necessary usage volume to warrant the purchase.
Assume you've successfully argued/proved that the problem you're trying
to solve can be addressed by using a Beowulf cluster. Do you have the
human resources to actually do so? There have been several cases of people
getting clusters to address specific problems, only to discover they
don't have time to learn how to code properly for them, and neither does
anyone else.
And if your funding source is then of a type that frowns on their money
being spent with nothing really happening as a result, finding funding
for more useful stuff later isn't going to be easy.
Bjørn
--
Bjørn Tore Sund Phone: (+47) 555-84894 Stupidity is like a
System administrator Fax: (+47) 555-89672 fractal; universal and
Math. Department Mobile: (+47) 918 68075 infinitely repetitive.
University of Bergen VIP: 81724
Support: system at mi.uib.no Contact: teknisk at mi.uib.no Direct: bjornts at mi.uib.no
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From Karl.Bellve at umassmed.edu Mon Mar 14 10:43:32 2005
From: Karl.Bellve at umassmed.edu (Karl Bellve)
Date: Mon, 14 Mar 2005 10:43:32 -0500
Subject: [Beowulf] Reference Paper
Message-ID: <4235B124.8000705@umassmed.edu>
Is there a Beowulf paper by Donald Becker that I could reference in an
upcoming publication?
--
Cheers,
Karl Bellve, Ph.D. ICQ # 13956200
Biomedical Imaging Group TLCA# 7938
University of Massachusetts
Email: Karl.Bellve at umassmed.edu
Phone: (508) 856-6514
Fax: (508) 856-1840
PGP Public key: finger kdb at molmed.umassmed.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From Glen.Gardner at verizon.net Mon Mar 14 17:41:37 2005
From: Glen.Gardner at verizon.net (Glen Gardner)
Date: Mon, 14 Mar 2005 17:41:37 -0500
Subject: [Beowulf] The move to gigabit - technical questions
References: <3.0.32.20050314183411.0122fa88@pop.xs4all.nl>
Message-ID: <42361321.5070402@verizon.net>
Gigabit will be a little faster than 100Mbit on a small cluster, but not
a lot.
I ended up using 5 cheap gigabit switches to make a gigabit concentrator
for my 12 node cluster.
It eliminated the tendency for the network to saturate under a heavy load.
It also let me use gigabit network cards in my I/O node and controlling
node with a small improvement in file I/O.
The compute nodes remained on 100 Mbit to conserve power. The setup
works rather nicely.
Glen
Vincent Diepeveen wrote:
>Good evening,
>
>It's interesting to investigate what gigabit can do for small home clusters.
>
>Any latency oriented approach is doomed to fail obviously at gigabit. But
>they're cheap. For 40 euro i see several getting offered already.
>
>First important question is of course how much system time those NIC's eat
>when fully loading their bandwidth.
>
>Example, i have an old dual k7 here with pci 2.2 (32 bits 33Mhz).
>Suppose i put a gigabit card in it.
>
>In say 6 messages a second i ship 8MB data at a time. Ship and send in turn.
>
>So it ships a packet of 8MB, then receives a packet of 8MB.
>
>Other than the cost of the thread to store the packet to RAM, does such a
>card in any way stop or block the cpu's which are 100% loaded with
>searching software (my chessprogram diep in this case)?
>
>What penalty other than that thread handling the message is there in terms
>of system time reduction to the 2 processes searching?
>
>Oh btw, i assume that gigabit can handle 48MB/s user data a second?
>
>Vincent
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
>
>
--
Glen E. Gardner, Jr.
AA8C
AMSAT MEMBER 10593
Glen.Gardner at verizon.net
http://members.bellatlantic.net/~vze24qhw/index.html
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From billharman at comcast.net Mon Mar 14 15:47:34 2005
From: billharman at comcast.net (Bill Harman)
Date: Mon, 14 Mar 2005 13:47:34 -0700
Subject: [Beowulf] Grants for Beowulf Clusters
In-Reply-To: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
Message-ID: <200503142044.j2EKiXSf014963@bluewest.scyld.com>
You can always try the Oak Ridge National Lab approach - The Stone
SouperComputer - their original Beowulf cluster, which is no longer in
operation, but was made up of used and discarded PC's.
http://stonesoup.esd.ornl.gov/
Go around to the different departments in the university or to local businesses and
ask for their used equipment; who knows, maybe you can build a 64-node
heterogeneous cluster at the right price.
Bill Harman,
Salt Lake City office
P - (801) 572-9252 F - (801) 571-4927
wharman at prism.net
billharman at comcast.net
skype: harman8015729252
-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On
Behalf Of Timo Mechler
Sent: Friday, March 11, 2005 8:20 PM
To: beowulf at beowulf.org
Subject: [Beowulf] Grants for Beowulf Clusters
Hi all,
I'm wondering what kind of success rate people are having with obtaining
grants for Beowulf type Linux Clusters (for example, from the National
Science Foundation). Let me give you a little bit more info as to why I'm
asking this: I'm a junior undergraduate at a small liberal arts college in
Iowa (~2600 students), and have solely been pursuing Beowulf clusters for
well over a year now. I believe strongly that even though my school is
small, several departments on campus could benefit from the use of a beowulf
cluster in the research that does go on. I've been using older, slower
machines as a proof of concept for now. Ideally, we would eventually want a
faster beowulf system that offers significant improvements over anything
desktop PCs have to offer nowadays. Since money is an issue at
smaller schools, is there any way I could obtain a grant for a beowulf cluster?
If so, besides the NSF, what would be some other sources to apply to? Since
some of you guys on this list come from big companies or universities, I would
appreciate any insight and suggestions you can give me. All input is
appreciated. Thanks in advance!
Best Regards,
-Timo Mechler
--
Timo R. Mechler
mechti01 at luther.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From aleahy at knox.edu Mon Mar 14 12:23:48 2005
From: aleahy at knox.edu (Andrew Leahy)
Date: Mon, 14 Mar 2005 11:23:48 -0600
Subject: [Beowulf] Grants for Beowulf Clusters
In-Reply-To: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
References: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
Message-ID: <4235C8A4.70908@knox.edu>
We had some success at getting funding from the NSF/CCLI program for a
college-level beowulf cluster:
https://www.fastlane.nsf.gov/servlet/showaward?award=0089045
It might be a good idea to search through NSF Fastlane to see what else
they've funded through CCLI.
I personally don't believe that a dedicated beowulf cluster is
necessarily a good investment for a strictly undergraduate institution.
We may be outliers among liberal arts colleges, but many of our
scientists aren't all that computationally-minded and so far our cluster
hasn't seen any use outside of the mathematics and computer science
departments. However, our computer scientists were looking for a linux
lab (on our exclusively Windows campus) for use in instruction. So we
argued that we could build a dual use linux lab/beowulf cluster that we
could hold classes in and use as a cluster.
Technically, this isn't a beowulf cluster. However, we gave each system
dual processors (reasoning that a student simply typing away on a
document wouldn't really notice if one of the processors was working on
a computationally intensive program at the same time) and we equipped
each system with two network cards, one of which was hooked up to our
own dedicated "beowulf" network. So you might say it was "beowulf-like".
I can't say that our grant was a tremendous success, but that's largely
for external reasons. We built it at the height of the ".com" era, when
there were gobs of computer science students. As it was originally
envisioned, the course we designed would be a numerical analysis course
with an emphasis on parallel algorithms (a sexy subject for CS
types--there isn't anything else about distributed memory programming in
the curriculum) primarily for solving large systems of linear equations.
However, when ".com" went bust CS enrollments dried up and we haven't
really had a big audience. Right now, I'm retooling the course to focus
on applied partial differential equations, again with an emphasis on
solving large systems of equations with parallel algorithms. We'll see
if it can pick up a broader audience among science students in general.
If anybody else has ideas for distributed computing topics that would go
well in an undergraduate numerical analysis course, please let me know.
Andrew Leahy
Knox College
Timo Mechler wrote:
> Hi all,
>
> I'm wondering what kind of success rate people are having with obtaining
> grants for Beowulf type Linux Clusters (for example, from the National
> Science Foundation). Let me give you a little bit more info as to why I'm
> asking this: I'm a junior undergraduate at a small liberal arts college
> in Iowa (~2600 students), and have solely been pursuing Beowulf clusters
> for well over a year now. I believe strongly that even though that my
> school is small, several departments on campus could benefit from the use
> of a beowulf cluster in the research that does go on. I've been using
> older, slower machines as a proof of concept for now. Ideally, we would
> want a faster beowulf system eventually that offers significant
> improvements over anything desktop pc's have to offer nowadays. Being
> that money is an issue at smaller schools, is there any I could obtain a
> grant for a beowulf cluster? If so, besides the NSF, what would be some
> other sources to apply to? Since some of you guys on this come from big
> companies or Univesities, I would appreciate any insight and suggestions
> you can give me. All input is appreciated. Thanks in advance!
>
> Best Regards,
>
> -Timo Mechler
>
---
[This E-mail scanned for viruses by Declude Virus]
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From bropers at cct.lsu.edu Mon Mar 14 09:23:24 2005
From: bropers at cct.lsu.edu (Brian D. Ropers-Huilman)
Date: Mon, 14 Mar 2005 08:23:24 -0600
Subject: [Beowulf] Grants for Beowulf Clusters
In-Reply-To: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
References: <1883.172.17.11.39.1110597580.squirrel@172.17.11.39>
Message-ID: <42359E5C.6020504@cct.lsu.edu>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Timo,
Unfortunately, the NSF and DOE wells are running dry. Actually, we are
quite confused as to what these major organizations are doing in terms of
funding and believe they do not have a clear strategy as of yet. We even
had Peter Freeman down last fall to give a talk and still were not
completely clear on the NSF direction, though this recent article might
shed some light [
http://www.informationweek.com/story/showArticle.jhtml?articleID=159401335&tid=5978
].
In our estimation, NIH, with a focus on bioinformatics, is the best likely
source. We have also learned that major collaboratories are more likely to
be funded. These agencies no longer want to fund department X's little
cluster, department Y's little cluster, and then department Z's as well.
They would prefer that the departments come together and get funding for a
single larger system.
I realize this does not directly answer your question, but thought I would
provide my viewpoint.
P.S. I was born and raised in Platteville, Wisconsin, not 100 miles from
Decorah, through my undergraduate studies at Madison and my first several
years in the corporate world. Both my and my partner's families are still
there and we both still call Wisconsin "home."
Timo Mechler said the following on 2005.03.11 21:19:
> Hi all,
>
> I'm wondering what kind of success rate people are having with obtaining
> grants for Beowulf type Linux Clusters (for example, from the National
> Science Foundation). Let me give you a little bit more info as to why I'm
> asking this: I'm a junior undergraduate at a small liberal arts college
> in Iowa (~2600 students), and have solely been pursuing Beowulf clusters
> for well over a year now. I believe strongly that even though that my
> school is small, several departments on campus could benefit from the use
> of a beowulf cluster in the research that does go on. I've been using
> older, slower machines as a proof of concept for now. Ideally, we would
> want a faster beowulf system eventually that offers significant
> improvements over anything desktop pc's have to offer nowadays. Being
> that money is an issue at smaller schools, is there any I could obtain a
> grant for a beowulf cluster? If so, besides the NSF, what would be some
> other sources to apply to? Since some of you guys on this come from big
> companies or Univesities, I would appreciate any insight and suggestions
> you can give me. All input is appreciated. Thanks in advance!
>
> Best Regards,
>
> -Timo Mechler
>
- --
Brian D. Ropers-Huilman .:. Asst. Director .:. HPC and Computation
Center for Computation & Technology (CCT) bropers at cct.lsu.edu
Johnston Hall, Rm. 350 +1 225.578.3272 (V)
Louisiana State University +1 225.578.5362 (F)
Baton Rouge, LA 70803-1900 USA http://www.cct.lsu.edu/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFCNZ5bwRr6eFHB5lgRAoX9AKDYFTwIM+DL1TUFdVnOmoQrtZ/HEgCeIo/s
uEz8BodFo/g0N11CbQhomQA=
=K8Vk
-----END PGP SIGNATURE-----
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Tue Mar 15 15:11:44 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Tue, 15 Mar 2005 21:11:44 +0100
Subject: [Beowulf] The move to gigabit - technical questions
Message-ID: <3.0.32.20050315211144.016291f0@pop.xs4all.nl>
At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
>Gigabit will be a little faster than 100Mbit on a small cluster, but not
>a lot.
What is 'not a lot'?
I would guess it's a factor of 10 faster in bandwidth?
>I ended up using 5 cheap gigabit switches to make a gigabit concentrator
>for my 12 node cluster.
>It eliminated the tendency for the network to saturate under a heavy load.
Very interesting, can you post a connection scheme and routing table?
>It also let me use gigabit network cards in my I/O node and controlling
>node with a small improvement in file I/O.
Streaming i/o or random access?
cheapo disk arrays get what is it, 400MB/s handsdown or so?
that's raid5 readspeed, plenty security at a raid5 array.
>The compute nodes remaind with 100 Mbit to conserve power. The setup
>works rather nicely.
what type of software do you run at it,
embarrassingly parallel software?
Vincent
>Glen
>
>Vincent Diepeveen wrote:
>
>>Good evening,
>>
>>It's interesting to investigate what gigabit can do for small home clusters.
>>
>>Any latency oriented approach is doomed to fail obviously at gigabit. But
>>they're cheap. For 40 euro i see several getting offered already.
>>
>>First important question is of course how much system time those NIC's eat
>>when fully loading their bandwidth.
>>
>>Example, i have an old dual k7 here with pci 2.2 (32 bits 33Mhz).
>>Suppose i put a gigabit card in it.
>>
>>In say 6 messages a second i ship 8MB data at a time. Ship and send in turn.
>>
>>So it ships a packet of 8MB, then receives a packet of 8MB.
>>
>>Other than the cost of the thread to store the packet to RAM, does such a
>>card in any way stop or block the cpu's which are 100% loaded with
>>searching software (my chessprogram diep in this case)?
>>
>>What penalty other than that thread handling the message is there in terms
>>of system time reduction to the 2 processes searching?
>>
>>Oh btw, i assume that gigabit can handle 48MB/s user data a second?
>>
>>Vincent
>>
>>_______________________________________________
>>Beowulf mailing list, Beowulf at beowulf.org
>>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>>
>>
>>
>
>--
>Glen E. Gardner, Jr.
>AA8C
>AMSAT MEMBER 10593
>Glen.Gardner at verizon.net
>
>
>http://members.bellatlantic.net/~vze24qhw/index.html
>
>
>
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From james.p.lux at jpl.nasa.gov Tue Mar 15 16:09:00 2005
From: james.p.lux at jpl.nasa.gov (Jim Lux)
Date: Tue, 15 Mar 2005 13:09:00 -0800
Subject: [Beowulf] The move to gigabit - technical questions
References: <3.0.32.20050315211144.016291f0@pop.xs4all.nl>
Message-ID: <004201c529a3$35e70db0$32a8a8c0@LAPTOP152422>
----- Original Message -----
From: "Vincent Diepeveen"
To: "Glen Gardner"
Cc: ;
Sent: Tuesday, March 15, 2005 12:11 PM
Subject: Re: [Beowulf] The move to gigabit - technical questions
> At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
> >Gigabit will be a little faster than 100Mbit on a small cluster, but not
> >a lot.
>
> What is 'not a lot'.
>
> I would guess it's factor 10 faster in bandwidth?
I would guess it's not 10 times faster (leaving aside latency and bus
bandwidth to the interface issues). I don't know about the details, but it's
real common to have some sort of synchronization/equalization sequence at
the front of the packet that runs at a lower bit rate than the payload rate.
Wired "thicknet" ethernet, as I recall, had a 64 bit alternating 1/0
preamble before the actual packet contents w/header. It could be adequately
modeled, though, as some sort of fixed per packet overhead time.
This is especially true for wireless LANs (I know that not many clusters use
these, but as they get faster, and there's more channels available, it gets
attractive.. no cables!). 802.11a/b/g always starts at 2 Mb/sec for the
preamble, and then shifts to a faster modulation (depending on the
propagation).
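To put rough numbers on that fixed per-packet-overhead view for wired gigabit, here is a back-of-the-envelope sketch using the standard field sizes for 1500-byte frames (textbook values, not measurements from this thread):

/* Back-of-the-envelope only: per-frame overhead model for TCP over
 * gigabit Ethernet with standard 1500-byte frames. */
#include <stdio.h>

int main(void)
{
    const double line_rate   = 1e9 / 8;         /* 125e6 bytes/s raw */
    const double mtu         = 1500;            /* IP packet size */
    const double eth_framing = 8 + 14 + 4 + 12; /* preamble+SFD, MAC header, CRC, inter-frame gap */
    const double ip_tcp_hdr  = 20 + 20;         /* no IP/TCP options */

    double wire_bytes = mtu + eth_framing;      /* bytes on the wire per frame */
    double payload    = mtu - ip_tcp_hdr;       /* TCP payload per frame */
    double efficiency = payload / wire_bytes;

    printf("per-frame efficiency:  %.1f %%\n", efficiency * 100.0);             /* about 94.9 */
    printf("best-case TCP goodput: %.1f MB/s\n", efficiency * line_rate / 1e6); /* about 118.7 */
    return 0;
}

That ceiling of roughly 95% of line rate is before any driver, interrupt, or PCI-bus costs on the host, which is where most of the remaining gap between theory and measured throughput usually goes.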
Jim Lux
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From brian at cypher.acomp.usf.edu Tue Mar 15 18:21:45 2005
From: brian at cypher.acomp.usf.edu (Brian R Smith)
Date: Tue, 15 Mar 2005 18:21:45 -0500
Subject: [Beowulf] Folding@Home on a Beowulf?
In-Reply-To: <4233B2D3.7000807@spiekerfamily.com>
References: <4233B2D3.7000807@spiekerfamily.com>
Message-ID: <42376E09.30207@cypher.acomp.usf.edu>
Not really a beowulf question, but i'll bite. Some time ago, we had a
condor grid at my center with some pretty wild machines. The problem
was that no one had any use for it (they were obsessed with running
their projects on our beowulf despite the fact that they were mostly
single processor jobs). So i decided i'd run folding at home on 6 of the
nodes, just to put them to use somehow (I usually use one of our _true_
beowulfs for my work and anything serious).
Just set up f at h on each machine, giving each a different machine id
number, 1-6 should suffice. As for getting the data sets, you should
probably find some way of getting these nodes online... Do you have a
switch or some extra ethernet ports? Maybe use one of the machines as a
router? You could also whip up a script that would "pretend" to be each
instance of a machine id (1-6), receive the data, then place it in the
proper directory in each node.
Your script would run on a machine with internet access. It would start
a copy of f at h as a particular machine id (i think you can do this by
feeding it different config files). Once it has retrieved the data,
kill the process, move the data to one of your nodes and repeat. As for
sending the data back... i'm at a loss.
It's been a while since I've run this, as it is more of a distributed
computing project than a beowulf-ready application, so i might be a
little off on this. However, you get the general idea.
-brian
Jake Thebault-Spieker wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Does anybody have any experience with Folding at Home
>(http://folding.stanford.edu)? I'd like to run it on my six node 133MHz
>CPU, but my cluster won't be online. Is there a way to get the folding
>jobs another way? Like downloading them at a different location, then
>transferring them to the cluster?
>
>- --
>I think computer viruses should count as life.
>I think it says something about human nature
>that the only form of life we have created so far is purely destructive.
>We've created life in our own image.
>- --Stephen Hawking
>
>010010100110000101101
>011011001010010000001
>010100011010000110010
>101100010011000010111
>010101101100011101000
>010110101010011011100
>000110100101100101011
>010110110010101110010
>/www.plinko.net\\>
>Jake Thebault-Spieker
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.2.5 (MingW32)
>Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
>
>iD8DBQFCM7LTI2YvXV9Bxi0RApkcAJ93zARE+h0iqhrqAmP0JbCxXSgdnwCg4eWw
>HuIQGgZXjZmXn99FcUKNFlc=
>=opi/
>-----END PGP SIGNATURE-----
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Wed Mar 16 09:41:40 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 16 Mar 2005 09:41:40 -0500 (EST)
Subject: [Beowulf] The move to gigabit - technical questions
In-Reply-To: <3.0.32.20050315211144.016291f0@pop.xs4all.nl>
References: <3.0.32.20050315211144.016291f0@pop.xs4all.nl>
Message-ID:
On Tue, 15 Mar 2005, Vincent Diepeveen wrote:
> At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
> >Gigabit will be a little faster than 100Mbit on a small cluster, but not
> >a lot.
>
> What is 'not a lot'.
>
> I would guess it's factor 10 faster in bandwidth?
(Maybe, you don't get QUITE 100% of the raw clock advantage in all
applications on all hardware, Vincent;-). However, for most
applications on most hardware you >>should<< get a significant advantage
-- 80-95% of 10x, or 8-9.5x. Not just "a little".
A really, really cheap switch might have problems with bisection
bandwidth and chop this down for simultaneous flat-out bidirectional
data streams, but relatively few parallel applications engage in
flat-out bidirectional communications. Even if it does, your problem is
more likely to be with resource contention (e.g. two hosts trying to
talk to a third at the same time) than it is with actual bandwidth
oversubscription. This is what Vincent is suggesting that you look into
(or let us look into:-) below.
If your particular usage pattern does create resource contention, then
you might well need to either hand-optimize the pattern to avoid
saturating your cheap hardware, create a network with cheap components
that effectively breaks up the pathological communications pattern
(which it sounds like is what you actually did) or buy better hardware
(either better gigE switches or a "real" HPC network).
However you shouldn't really trash gigE itself -- it isn't at fault and
your results aren't typical.
rgb
>
> >I ended up using 5 cheap gigabit switches to make a gigabit concentrator
> >for my 12 node cluster.
> >It eliminated the tendency for the network to saturate under a heavy load.
>
> Very interesting, can you post a connection scheme and routing table?
>
> >It also let me use gigabit network cards in my I/O node and controlling
> >node with a small improvement in file I/O.
>
> Streaming i/o or random access?
>
> cheapo disk arrays get what is it, 400MB/s handsdown or so?
>
> that's raid5 readspeed, plenty security at a raid5 array.
>
> >The compute nodes remaind with 100 Mbit to conserve power. The setup
> >works rather nicely.
>
> what type of software do you run at it,
> embarrassingly parallel software?
>
> Vincent
>
> >Glen
> >
> >Vincent Diepeveen wrote:
> >
> >>Good evening,
> >>
> >>It's interesting to investigate what gigabit can do for small home clusters.
> >>
> >>Any latency oriented approach is doomed to fail obviously at gigabit. But
> >>they're cheap. For 40 euro i see several getting offered already.
> >>
> >>First important question is of course how much system time those NIC's eat
> >>when fully loading their bandwidth.
> >>
> >>Example, i have an old dual k7 here with pci 2.2 (32 bits 33Mhz).
> >>Suppose i put a gigabit card in it.
> >>
> >>In say 6 messages a second i ship 8MB data at a time. Ship and send in turn.
> >>
> >>So it ships a packet of 8MB, then receives a packet of 8MB.
> >>
> >>Other than the cost of the thread to store the packet to RAM, does such a
> >>card in any way stop or block the cpu's which are 100% loaded with
> >>searching software (my chessprogram diep in this case)?
> >>
> >>What penalty other than that thread handling the message is there in terms
> >>of system time reduction to the 2 processes searching?
> >>
> >>Oh btw, i assume that gigabit can handle 48MB/s user data a second?
> >>
> >>Vincent
> >>
> >>_______________________________________________
> >>Beowulf mailing list, Beowulf at beowulf.org
> >>To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
> >>
> >>
> >>
> >
> >--
> >Glen E. Gardner, Jr.
> >AA8C
> >AMSAT MEMBER 10593
> >Glen.Gardner at verizon.net
> >
> >
> >http://members.bellatlantic.net/~vze24qhw/index.html
> >
> >
> >
> >
> >
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Wed Mar 16 09:30:41 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 16 Mar 2005 09:30:41 -0500 (EST)
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
In-Reply-To: <19430ea71246a4e9a13899873914bff7@gmail.com>
References: <19430ea71246a4e9a13899873914bff7@gmail.com>
Message-ID:
On Tue, 15 Mar 2005, Josh Zamor wrote:
> I have just started in on programming using PVM and have run into an
> odd problem. I have written a C (c99) program that calculates the
> factorial of a number by calculating parts of the range. I'm using the
> GMP for dealing with large numbers (I have done this program
> successfully before using numerous methods including pthreads). The
> basic way it works for the cluster is that the program starts on a
> machine, determines the subrange to be calculated for each task and
> then waits for each process to come back with the answer for it's
> subrange. The main process then finds the total result of the factorial
> from multiplying the subrange results back together... Pretty
> standard...
>
> The problem is that after a subrange is calculated by a task the result
> is put into a character array (null terminated, created from GMP's
> mpz_getstr()), it is packaged using pvm_pkstr() and sent to the parent.
> The parent can receive this using pvm_recv(), but as soon as it tries
> to store this into a character array in the program using pvm_upkstr(),
> or explore it with pvm_bufinfo(), it segfaults.
>
> I've also tried this in a couple of different ways, passing ints, it
> works, but passing strings (either large or small) results in a
> segfault.
This SOUNDS like a programming error -- using a pointer as an int or vice
versa.
I'd do two things -- look at the actual result produced by GMP on the
client side in some detail -- dumping it bytewise a character at a time
isn't that dumb an idea. GMP introduces all sorts of new types, and
I'll BET that these types are structs, not the actual data. So is the
result a normal pointer-addressable string or a struct? Maybe what you
are returning is a container for a pointer to anonymous memory on the
client, not the contents of that memory... (Note that I've never used
GMP so don't know, but you definitely need to check to make sure that
what you are returning is an actual complete data object and not a
container e.g. a struct or linked list).
I assume that you've experimented and have no difficulty returning and
unpacking ordinary ints, strings, or raw data blocks with PVM. If so
you probably aren't making a pointer error on the master server side,
although it never hurts to check.
If you want other eyes on your actual code (might be useful if it is
indeed programmer error) please post.
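For concreteness, here is a minimal sketch of the pack/unpack pattern under discussion, using only stock PVM 3 and GMP calls (the tag name WORK_TAG and base 10 are illustrative, not taken from Josh's code). One common way to get exactly this segfault is calling pvm_upkstr() on a pointer that was never allocated, or on a buffer that is too small; pvm_upkstr() copies into caller-supplied storage, so the sketch sizes the receive buffer from pvm_bufinfo() before unpacking.

/* Sketch only: worker packs a GMP result as a string, parent sizes its
 * buffer from pvm_bufinfo() before unpacking.  WORK_TAG is illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>
#include <pvm3.h>

#define WORK_TAG 42

/* Worker side: render the partial result as a decimal string and send it. */
void send_partial(const mpz_t partial, int parent_tid)
{
    char *s = mpz_get_str(NULL, 10, partial);  /* NULL => GMP allocates the string */
    pvm_initsend(PvmDataDefault);
    pvm_pkstr(s);
    pvm_send(parent_tid, WORK_TAG);
    free(s);  /* with GMP's default allocator this is plain malloc/free */
}

/* Parent side: receive one partial result into storage we own. */
char *recv_partial(void)
{
    int bufid = pvm_recv(-1, WORK_TAG);
    int nbytes, tag, tid;
    pvm_bufinfo(bufid, &nbytes, &tag, &tid);   /* total bytes in the message */
    char *buf = malloc((size_t)nbytes + 1);    /* always >= strlen(result) + 1 */
    if (buf == NULL) { pvm_exit(); exit(1); }
    pvm_upkstr(buf);   /* segfaults if buf is NULL or too small */
    return buf;        /* caller can feed this to mpz_set_str() and multiply it in */
}

If the receive side already allocates enough room and it still only dies on the Linux box, dumping the raw string on both ends, as suggested above, is the next step.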
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Wed Mar 16 10:52:07 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 16 Mar 2005 10:52:07 -0500 (EST)
Subject: [Beowulf] The move to gigabit - technical questions
In-Reply-To:
References: <3.0.32.20050315211144.016291f0@pop.xs4all.nl>
Message-ID:
On Wed, 16 Mar 2005, Robert G. Brown wrote:
> On Tue, 15 Mar 2005, Vincent Diepeveen wrote:
>
> > At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
> > >Gigabit will be a little faster than 100Mbit on a small cluster, but not
> > >a lot.
> >
> > What is 'not a lot'.
> >
> > I would guess it's factor 10 faster in bandwidth?
I hate to reply to myself, but I should have noted that the below
applies to BANDWIDTH-dominated, not latency-dominated, communications. It was
implicit from Vincent's reply, but I should have made it explicit. For
lots of small packets gigabit's advantage probably won't be 10x, and
this is another case where a higher-end network is indicated. However,
the latency probably won't change a lot with different switches or
switch arrangements, either, except for the worse along paths with
multiple switch hops in between.
I should also have pointed out to the original poster that there are
nice tools (e.g. netperf, netpipe, lmbench) that will help him analyze
his raw network performance outside of a particular application that
might well have poor "networking" performance for reasons that have
nothing to do with the actual network. There are also lots of articles
out there both in the list archives, in Cluster World magazine back
issues, in linux magazine back issues, and on various websites
(including mine and brahma's) that can really help one understand just
what ethernet is and how it works and what its numbers should be. It is
the most widely implemented and widely understood network, good, bad,
and ugly features notwithstanding.
rgb
>
> (Maybe, you don't get QUITE 100% of the raw clock advantage in all
> applications on all hardware, Vincent;-). However, for most
> applications on most hardware you >>should<< get a signficant advantage
> -- 80-95% of 10x, or 8-9.5x. Not a just "a little".
>
> A really, really cheap switch might have problems with bisection
> bandwidth and chop this down for simultaneous flat-out bidirectional
> data streams, but relatively few parallel applications engage in
> flat-out bidirectional communications. Even if it does, your problem is
> more likely to be with resource contention (e.g. two hosts trying to
> talk to a third at the same time) than it is with actual bandwidth
> oversubscription. This is what Vincent is suggesting that you look into
> (or let us look into:-) below.
>
> If your particular usage pattern does create resource contention, then
> you might well need to either hand-optimize the pattern to avoid
> saturating your cheap hardware, create a network with cheap components
> that effectively breaks up the pathological communications pattern
> (which it sounds like is what you actually did) or buy better hardware
> (either better gigE switches or a "real" HPC network).
>
> However you shouldn't really trash gigE itself -- it isn't at fault and
> your results aren't typical.
>
> rgb
>
> >
> > >I ended up using 5 cheap gigabit switches to make a gigabit concentrator
> > >for my 12 node cluster.
> > >It eliminated the tendency for the network to saturate under a heavy load.
> >
> > Very interesting, can you post a connection scheme and routing table?
> >
> > >It also let me use gigabit network cards in my I/O node and controlling
> > >node with a small improvement in file I/O.
> >
> > Streaming i/o or random access?
> >
> > cheapo disk arrays get what is it, 400MB/s handsdown or so?
> >
> > that's raid5 readspeed, plenty security at a raid5 array.
> >
> > >The compute nodes remaind with 100 Mbit to conserve power. The setup
> > >works rather nicely.
> >
> > what type of software do you run at it,
> > embarrassingly parallel software?
> >
> > Vincent
> >
> > >Glen
> > >
> > >Vincent Diepeveen wrote:
> > >
> > >>Good evening,
> > >>
> > >>It's interesting to investigate what gigabit can do for small home clusters.
> > >>
> > >>Any latency oriented approach is doomed to fail obviously at gigabit. But
> > >>they're cheap. For 40 euro i see several getting offered already.
> > >>
> > >>First important question is of course how much system time those NIC's eat
> > >>when fully loading their bandwidth.
> > >>
> > >>Example, i have an old dual k7 here with pci 2.2 (32 bits 33Mhz).
> > >>Suppose i put a gigabit card in it.
> > >>
> > >>In say 6 messages a second i ship 8MB data at a time. Ship and send in turn.
> > >>
> > >>So it ships a packet of 8MB, then receives a packet of 8MB.
> > >>
> > >>Other than the cost of the thread to store the packet to RAM, does such a
> > >>card in any way stop or block the cpu's which are 100% loaded with
> > >>searching software (my chessprogram diep in this case)?
> > >>
> > >>What penalty other than that thread handling the message is there in terms
> > >>of system time reduction to the 2 processes searching?
> > >>
> > >>Oh btw, i assume that gigabit can handle 48MB/s user data a second?
> > >>
> > >>Vincent
> > >>
> > >>_______________________________________________
> > >>Beowulf mailing list, Beowulf at beowulf.org
> > >>To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
> > >>
> > >>
> > >>
> > >
> > >--
> > >Glen E. Gardner, Jr.
> > >AA8C
> > >AMSAT MEMBER 10593
> > >Glen.Gardner at verizon.net
> > >
> > >
> > >http://members.bellatlantic.net/~vze24qhw/index.html
> > >
> > >
> > >
> > >
> > >
> >
>
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Wed Mar 16 11:10:33 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 16 Mar 2005 17:10:33 +0100
Subject: [Beowulf] The move to gigabit - technical questions
Message-ID: <3.0.32.20050316171033.0162d318@pop.xs4all.nl>
At 10:52 AM 3/16/2005 -0500, Robert G. Brown wrote:
>On Wed, 16 Mar 2005, Robert G. Brown wrote:
>
>> On Tue, 15 Mar 2005, Vincent Diepeveen wrote:
>>
>> > At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
>> > >Gigabit will be a little faster than 100Mbit on a small cluster, but
not
>> > >a lot.
>> >
>> > What is 'not a lot'.
>> >
>> > I would guess it's factor 10 faster in bandwidth?
>
>I hate to reply to myself, but I should have noted that the below
>applies to BANDWIDTH, not latency, dominated communictions. It was
Well Robert, it's obvious you understood me correctly. I was talking about
bandwidth, and i find the factor of 8-9 higher bandwidth from moving from
100mbit to 1gbit a *considerable* jump forward, especially because the
price for such network cards is one that everyone can afford.
For latency, and even more bandwidth, we all know there are highend network
cards like dolphin, quadrics, myri and hopefully also infiniband (i see on
the linux kernel list a lot of postings regarding infiniband; which are the
actual *brands* that sell a concrete card right now which can be bought
stand alone? If so, url?).
For distributed shared memory (DSM), which supercomputers need, there is
quadrics. Is there any other highend network card where one can access
the RAM of the cards directly? (like with the Cray shmem library quadrics
has, which i can find on their homepage).
Obviously for any latency issue one doesn't buy cheapo gigabit cards. So
the remaining interesting thing is what type of bandwidth it can give us.
On my 100mbit LANs i measured that i could effectively put roughly 60 mbit of
throughput onto it (bidirectional, so 60 mbit in total of sends and receives).
An 8.5 to 9 times higher bandwidth would mean roughly 510-540 mbit,
which is about 60-67 MB/s. Obviously there will be theoretic test setups getting
more.
That is not so interesting when discussing the practical workings of LANs in
operational systems.
Actually 60MB/s would be too little for my chess program.
lmbench is not so impressive as a benchmark. It doesn't measure TLB
thrashing of main memory.
Over the past years I've had so many discussions with professors who do not
understand the difference between the bandwidth and latency figures lmbench gives
and their actual application, which is just busy TLB thrashing, something lmbench
doesn't measure very accurately.
An example is that on my dual k7's a single read of 8 bytes in a 400MB
buffer eats up about 400 ns on average, using my own simplistic
testset. Such researchers however work with something like the 60ns on paper that lmbench
gives them.
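A minimal sketch of such a test set (illustrative only, not the actual test set described above): chase a randomly permuted pointer chain through a large buffer, so that every load depends on the previous one and, once the buffer is far bigger than the caches and the TLB reach, each hop pays the full miss cost that friendlier streaming patterns hide.

/* Pointer-chasing latency sketch.  The 400 MB buffer echoes the example
 * above; the hop count and seed are arbitrary.  Building the chain needs
 * roughly 800 MB temporarily because of the permutation array.
 * Compile with something like: gcc -std=c99 -O2 chase.c -lrt */
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BUF_BYTES (400UL * 1024 * 1024)
#define NODES     (BUF_BYTES / sizeof(void *))
#define HOPS      20000000L

int main(void)
{
    void **buf  = malloc(BUF_BYTES);
    size_t *ord = malloc(NODES * sizeof *ord);
    if (buf == NULL || ord == NULL) { perror("malloc"); return 1; }

    /* Random cyclic permutation: each element points at the next one in a
     * shuffled order, so neither the caches nor the prefetcher can help. */
    for (size_t i = 0; i < NODES; i++) ord[i] = i;
    srand(1);
    for (size_t i = NODES - 1; i > 0; i--) {
        size_t j = (size_t)((double)rand() / ((double)RAND_MAX + 1.0) * (double)(i + 1));
        size_t t = ord[i]; ord[i] = ord[j]; ord[j] = t;
    }
    for (size_t i = 0; i < NODES; i++)
        buf[ord[i]] = &buf[ord[(i + 1) % NODES]];
    free(ord);

    struct timespec t0, t1;
    void **p = (void **)buf[0];
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < HOPS; i++)
        p = (void **)*p;            /* one dependent pointer-sized load per hop */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (double)(t1.tv_nsec - t0.tv_nsec);
    printf("avg %.1f ns per dependent load (end %p)\n", ns / (double)HOPS, (void *)p);
    free(buf);
    return 0;
}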
I feel the word latency has been overused in that respect. The word
'bandwidth' however is very clear in this context.
>implicit from Vincent's reply, but I should have made it explicit. For
>lots of small packets gigabit's advantage probably won't be 10x, and
>this is another case where a higher-end network is indicated. However,
>the latency probably won't change a lot with different switches or
>switch arrangements, either, except for the worse along paths with
>multiple switch hops in between.
>
>I should also have pointed out to the original poster that there are
>nice tools (e.g. netperf, netpipe, lmbench) that will help him analyze
>his raw network performance outside of a particular application that
>might well have poor "networking" performance for reasons that have
>nothing to do with the actual network. There are also lots of articles
>out there both in the list archives, in Cluster World magazine back
>issues, in linux magazine back issues, and on various websites
>(including mine and brahma's) that can really help one understand just
>what ethernet is and how it works and what its numbers should be. It is
>the most widely implemented and widely understood network, good, bad,
>and ugly features notwithstanding.
>
> rgb
>
>>
>> (Maybe, you don't get QUITE 100% of the raw clock advantage in all
>> applications on all hardware, Vincent;-). However, for most
>> applications on most hardware you >>should<< get a signficant advantage
>> -- 80-95% of 10x, or 8-9.5x. Not a just "a little".
>>
>> A really, really cheap switch might have problems with bisection
>> bandwidth and chop this down for simultaneous flat-out bidirectional
>> data streams, but relatively few parallel applications engage in
>> flat-out bidirectional communications. Even if it does, your problem is
>> more likely to be with resource contention (e.g. two hosts trying to
>> talk to a third at the same time) than it is with actual bandwidth
>> oversubscription. This is what Vincent is suggesting that you look into
>> (or let us look into:-) below.
>>
>> If your particular usage pattern does create resource contention, then
>> you might well need to either hand-optimize the pattern to avoid
>> saturating your cheap hardware, create a network with cheap components
>> that effectively breaks up the pathological communications pattern
>> (which it sounds like is what you actually did) or buy better hardware
>> (either better gigE switches or a "real" HPC network).
>>
>> However you shouldn't really trash gigE itself -- it isn't at fault and
>> your results aren't typical.
>>
>> rgb
>>
>> >
>> > >I ended up using 5 cheap gigabit switches to make a gigabit
concentrator
>> > >for my 12 node cluster.
>> > >It eliminated the tendency for the network to saturate under a heavy
load.
>> >
>> > Very interesting, can you post a connection scheme and routing table?
>> >
>> > >It also let me use gigabit network cards in my I/O node and controlling
>> > >node with a small improvement in file I/O.
>> >
>> > Streaming i/o or random access?
>> >
>> > cheapo disk arrays get what is it, 400MB/s handsdown or so?
>> >
>> > that's raid5 readspeed, plenty security at a raid5 array.
>> >
>> > >The compute nodes remaind with 100 Mbit to conserve power. The setup
>> > >works rather nicely.
>> >
>> > what type of software do you run at it,
>> > embarrassingly parallel software?
>> >
>> > Vincent
>> >
>> > >Glen
>> > >
>> > >Vincent Diepeveen wrote:
>> > >
>> > >>Good evening,
>> > >>
>> > >>It's interesting to investigate what gigabit can do for small home
clusters.
>> > >>
>> > >>Any latency oriented approach is doomed to fail obviously at
gigabit. But
>> > >>they're cheap. For 40 euro i see several getting offered already.
>> > >>
>> > >>First important question is of course how much system time those
NIC's eat
>> > >>when fully loading their bandwidth.
>> > >>
>> > >>Example, i have an old dual k7 here with pci 2.2 (32 bits 33Mhz).
>> > >>Suppose i put a gigabit card in it.
>> > >>
>> > >>In say 6 messages a second i ship 8MB data at a time. Ship and send
in turn.
>> > >>
>> > >>So it ships a packet of 8MB, then receives a packet of 8MB.
>> > >>
>> > >>Other than the cost of the thread to store the packet to RAM, does
such a
>> > >>card in any way stop or block the cpu's which are 100% loaded with
>> > >>searching software (my chessprogram diep in this case)?
>> > >>
>> > >>What penalty other than that thread handling the message is there in
terms
>> > >>of system time reduction to the 2 processes searching?
>> > >>
>> > >>Oh btw, i assume that gigabit can handle 48MB/s user data a second?
>> > >>
>> > >>Vincent
>> > >>
>> > >>_______________________________________________
>> > >>Beowulf mailing list, Beowulf at beowulf.org
>> > >>To change your subscription (digest mode or unsubscribe) visit
>> > http://www.beowulf.org/mailman/listinfo/beowulf
>> > >>
>> > >>
>> > >>
>> > >
>> > >--
>> > >Glen E. Gardner, Jr.
>> > >AA8C
>> > >AMSAT MEMBER 10593
>> > >Glen.Gardner at verizon.net
>> > >
>> > >
>> > >http://members.bellatlantic.net/~vze24qhw/index.html
>> > >
>> > >
>> > >
>> > >
>> > >
>> >
>>
>>
>
>--
>Robert G. Brown http://www.phy.duke.edu/~rgb/
>Duke University Dept. of Physics, Box 90305
>Durham, N.C. 27708-0305
>Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
>
>
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From James.P.Lux at jpl.nasa.gov Wed Mar 16 12:22:29 2005
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Wed, 16 Mar 2005 09:22:29 -0800
Subject: [Beowulf] The move to gigabit - technical questions
In-Reply-To:
References: <3.0.32.20050315211144.016291f0@pop.xs4all.nl>
Message-ID: <6.1.1.1.2.20050316092047.0282ae10@mail.jpl.nasa.gov>
At 06:41 AM 3/16/2005, Robert G. Brown wrote:
>On Tue, 15 Mar 2005, Vincent Diepeveen wrote:
>
> > At 05:41 PM 3/14/2005 -0500, Glen Gardner wrote:
> > >Gigabit will be a little faster than 100Mbit on a small cluster, but not
> > >a lot.
> >
> > What is 'not a lot'.
> >
> > I would guess it's factor 10 faster in bandwidth?
>
>(Maybe, you don't get QUITE 100% of the raw clock advantage in all
>applications on all hardware, Vincent;-). However, for most
>applications on most hardware you >>should<< get a signficant advantage
>-- 80-95% of 10x, or 8-9.5x. Not a just "a little".
>
One might want to go back through the archives and see what the difference
between 10 Mbps and 100 Mbps Ethernet was, and more to the point, why it
wasn't 10x faster. While the details might change, the basic issues remain:
"wire speed"
"protocol overhead"
"physical interface to PC's CPU"
"drivers"
James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Wed Mar 16 12:37:47 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 16 Mar 2005 18:37:47 +0100
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
Message-ID: <3.0.32.20050316183746.0148e730@pop.xs4all.nl>
At 12:51 AM 3/15/2005 -0700, Josh Zamor wrote:
>I have just started in on programming using PVM and have run into an
>odd problem. I have written a C (c99) program that calculates the
>factorial of a number by calculating parts of the range. I'm using the
>GMP for dealing with large numbers (I have done this program
>successfully before using numerous methods including pthreads). The
Did you configure GMP correctly?
For math with big numbers it by default does not use FFT multiplication but way
slower methods. You might want to recompile it with FFT enabled in case you
didn't do this yet.
Please note that GMP is a very slow library; commercial libraries are up to
a factor of 3 faster (a linear speedup, not exponential; a more efficient
implementation).
I personally use GMP with great pleasure.
>basic way it works for the cluster is that the program starts on a
>machine, determines the subrange to be calculated for each task and
>then waits for each process to come back with the answer for it's
>subrange. The main process then finds the total result of the factorial
>from multiplying the subrange results back together... Pretty
>standard...
>The problem is that after a subrange is calculated by a task the result
>is put into a character array (null terminated, created from GMP's
>mpz_getstr()), it is packaged using pvm_pkstr() and sent to the parent.
>The parent can receive this using pvm_recv(), but as soon as it tries
>to store this into a character array in the program using pvm_upkstr(),
>or explore it with pvm_bufinfo(), it segfaults.
In general in parallel programming, you get the worst performance when all
processes must report to 1 central process.
It's far more efficient when each process is equal and 'divides' the work
done.
A simple calculation example of a problem i had at a 512 processor SGI is
that each 'hub' can handle at most 680MB data per second (for 4 processors
in total yes).
However if 499 other processors start reading/writing from/to this 'hub'
then real disasters will happen.
Things will completely lock up. Not only because all processors must divide
the small bandwidth, but also because you will get switch latency overhead
problems of routers and switches.
If they first must stream a few bytes data from A to B and then suddenly
from C to D, that's far less efficient than when 1 switch/router must
stream only from A to B.
Switches and routers sometimes have their own cache which is optimized for
those benchmark streaming tests simply. Switch latency can cause serious
problems if all processors want to use the same communication resources.
The general rule is to keep the routers/switches as less possible as busy
and try to make embarrassingly as possible parallel software.
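The code in this thread uses PVM, but as a rough sketch of that advice
(combine partial results pairwise rather than funnel every message to one
collector), MPI's built-in reduction is usually implemented internally as a
tree of partial combines. The values and names below are placeholders, not
from the original program:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double partial = (double)(rank + 1);   /* placeholder per-task result */
    double total = 0.0;

    /* Typically combined in roughly log(P) steps inside the library,
       instead of P independent sends to rank 0. */
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total = %g\n", total);

    MPI_Finalize();
    return 0;
}

For bignum partial products you would have to do the pairwise combining
yourself, but the communication pattern is the same.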
>I've also tried this in a couple of different ways, passing ints, it
>works, but passing strings (either large or small) results in a
>segfault.
>Of my 2 "cluster" learning setup one is running Mac OSX and the other
>is running Gentoo Linux. It is only the linux box that segfaults, the
>OSX box finds the answer correctly. The segfault happens whenever the
>linux box is part of the PVM created cluster, or when PVM is ran alone
>(no other boxes in the cluster) on the linux box.
Which compiler do you compile with?
I hope gcc only, and not Intel C++?
Intel C++ is notorious for playing games with floating point in order to look
faster on benchmarks.
Are you working with floating point or with integers?
>If anyone has any idea why this is happening, or is only happening on
>the linux box I would be grateful. Thanks, tech details follow:
>Gentoo Linux Box:
> Proc: AMD AthlonXP 1700+
> RAM: 512MB.
> Kernel: 2.4.26-gentoo-r9
> PVM: 3.4.5
> GMP: 4.1.4
> GCC: 3.3.5
>Mac OSX:
> Proc: G4 1GHz
> RAM: 768MB
> Kernel: 10.3.8, Mach 7.8.0
> PVM: 3.4.5
> GMP: 4.1.4
> GCC: 3.3 (20030304)
Are you using PGO with gcc? (PGO = profile-guided optimization)
There are major PGO bugs even in the latest gcc 3.4.3.
Those guys are all volunteers and very cool guys.
They are slow in bugfixing as they have other jobs too, and I don't blame them.
Vincent
>Thanks again.
>
>Regards,
>-J Zamor
>jzamor at gmail.com
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From ajt at rri.sari.ac.uk Wed Mar 16 07:21:23 2005
From: ajt at rri.sari.ac.uk (Tony Travis)
Date: Wed, 16 Mar 2005 12:21:23 +0000
Subject: [Beowulf] Folding@Home on a Beowulf?
In-Reply-To: <42376E09.30207@cypher.acomp.usf.edu>
References: <4233B2D3.7000807@spiekerfamily.com>
<42376E09.30207@cypher.acomp.usf.edu>
Message-ID: <423824C3.2090809@rri.sari.ac.uk>
Brian R Smith wrote:
> [...]
> Just set up f at h on each machine, giving each a different machine id
> number, 1-6 should suffice. As for getting the data sets, you should
> probably find some way of getting these nodes online... Do you have a
> switch or some extra ethernet ports? Maybe use one of the machines as a
> router? You could also whip up a script that would "pretend" to be each
> instance of a machine id (1-6), receive the data, then place it in the
> proper directory in each node.
Hello, Brian.
We've run both SETI at home and folding at home on our 64-node openMosix
Beowulf cluster using David Ranch's software firewall on the 'head' node
to allow IP masquerading of the compute nodes on the public internet
through our private cluster LAN:
http://en.tldp.org/HOWTO/IP-Masquerade-HOWTO/
It works very well :-)
Tony.
--
Dr. A.J.Travis, | mailto:ajt at rri.sari.ac.uk
Rowett Research Institute, | http://www.rri.sari.ac.uk/~ajt
Greenburn Road, Bucksburn, | phone:+44 (0)1224 712751
Aberdeen AB21 9SB, Scotland, UK. | fax:+44 (0)1224 716687
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jzamor at gmail.com Wed Mar 16 15:14:52 2005
From: jzamor at gmail.com (Josh Zamor)
Date: Wed, 16 Mar 2005 13:14:52 -0700
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
In-Reply-To:
References: <19430ea71246a4e9a13899873914bff7@gmail.com>
Message-ID:
On Mar 16, 2005, at 7:30 AM, Robert G. Brown wrote:
> I assume that you've experimented and have no difficulty returning and
> unpacking ordinary ints, strings, or raw data blocks with PVM. If so
> you probably aren't making a pointer error on the master server side,
> although it never hurts to check.
Actually, I had tried passing ints and the like in other programs and
had no problem, but when I tried to pass just a string in my factorial
program, not using GMP, but still having the GMP library, it would
segfault. What I hadn't tried was to create a simple test program for
just sending strings using PVM. I have now written a test program that
does not use the GMP and tries to send a basic string back from a child
to the parent. This again will segfault on the linux machine and work
just fine on the OSX machine (essentially BSD). The code that I tried
is posted below. For convenience I have also posted the source of this
test program, my simple factorial program, Makefiles, and script
outputs from both OSX and the linux machine at the following address:
http://www.cet.nau.edu/~jrz4/pvmTest/
--strPVM.c--
#include <stdio.h>
#include <pvm3.h>

int main(int argc, char** argv) {
    int info, mytid, myparent, child[2];

    if(mytid = pvm_mytid() < 0) {
        pvm_perror("Could not get mytid");
        return -1;
    }

    myparent = pvm_parent();
    if((myparent < 0) && (myparent != PvmNoParent)) {
        pvm_perror("Some odd errr for my parent");
        pvm_exit();
        return -1;
    }

    /* I am parent */
    if(myparent == PvmNoParent) {
        info = pvm_spawn(argv[0], NULL, PvmTaskDefault, NULL, 2, child);

        for(int i = 0; i < 2; ++i) {
            if(child[i] < 0)
                printf(" %d", child[i]);
            else
                printf("t%x\t", child[i]);
        }
        putchar('\n');

        if(info != 2) {
            pvm_perror("Kids didn't all spawn!");
            pvm_exit();
            return -1;
        }

        for(int i = 0; i < 2; ++i) {
            char* retStr;
            info = pvm_recv(-1, 11);
            info = pvm_upkstr(retStr);
            printf("Recieved return string: %s\n", retStr);
        }

        pvm_exit();
        return 0;
    } else { /* Child follows */
        char str[3];
        str[0] = 'a';
        str[1] = 'b';
        str[3] = (char)0;
        pvm_initsend(PvmDataDefault);
        pvm_pkstr(str);
        pvm_send(myparent, 11);
        pvm_exit();
        return 0;
    }
}
Also, for the character array "str" in the child segment above, I have
tried using malloc to create the memory, and using const char arrays as
well. While all of these methods give seg faults on the linux machine,
the above way was the only way that I tried that worked and didn't give
a bus error on Mac OSX.
Thanks again.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Wed Mar 16 16:07:51 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 16 Mar 2005 22:07:51 +0100
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
Message-ID: <3.0.32.20050316220751.01633148@pop.xs4all.nl>
At 01:25 PM 3/16/2005 -0700, Josh Zamor wrote:
>
>On Mar 16, 2005, at 10:37 AM, Vincent Diepeveen wrote:
>>
>> Did you configure GMP correctly?
>>
>> For math with big numbers it default does not use FFT calculations but
>> way
>> slower methods. You might want to recompile it with FFT enabled in
>> case you
>> didn't do this yet.
>>
>
>I actually haven't done this, though I'll certainly try that soon.
To quote a friend of mine: "Good programmers do not blink an eye at speeding
up scientifically written software by a factor of a million."
>>
>> In general in parallel programming the worst performance you get when
>> all
>> processes must report to 1 central process.
>>
>> It's far more efficient when each process is equal and 'divides' the
>> work
>> done.
>>
>> A simple calculation example of a problem i had at a 512 processor SGI
>> is
>> that each 'hub' can handle at most 680MB data per second (for 4
>> processors
>> in total yes).
>>
>> However if 499 other processors start reading/writing from/to this
>> 'hub'
>> then real disasters will happen.
>>
>> Things will completely lock up. Not only because all processors must
>> divide
>> the small bandwidth, but also because you will get switch latency
>> overhead
>> problems of routers and switches.
>>
>> If they first must stream a few bytes data from A to B and then
>> suddenly
>> from C to D, that's far less efficient than when 1 switch/router must
>> stream only from A to B.
>>
>> Switches and routers sometimes have their own cache which is optimized
>> for
>> those benchmark streaming tests simply. Switch latency can cause
>> serious
>> problems if all processors want to use the same communication
>> resources.
>>
>> The general rule is to keep the routers/switches as less possible as
>> busy
>> and try to make embarrassingly as possible parallel software.
>
>
>This is exactly the sort of thing that I will be looking for shortly,
>do you have any recommendations on either books or online texts that
>cover this sort of thing (best practices when programming for
>clusters)? I'm currently just experimenting, but this is a field that I
>think I want to get involved in.
Paranoia, sir, is the only thing I can advise. Never believe any data point a
manufacturer gives you until you can verify it yourself.
SGI claimed to me, for example, that a random lookup at a remote processor on
a 512-processor Origin3800 partition would cost no more than 460 nanoseconds
to get 8-128 bytes.
Of course, when the time came I benchmarked it on 460 processors, and on
average a read of 8 bytes took 5.8 us. That was with something like 100MB of
RAM per processor, each processor doing, for every new 8-byte read, a random
lookup to a random processor at a random memory location within that
460*100MB.
SGI is no exception.
High-end manufacturers have the problem that their competitors quote such
unrealistic numbers that the only way to sell their own stuff is to make an
even more incredible claim.
I had a cluster guy from another very large blue company swear to me that the
one-way ping-pong latency of their just-built 20xx-processor machine was under
5 us, using the best-selling high-end network card on the planet for
supercomputers.
So I asked a friend to run that ping-pong on just a 128-node partition, and it
was 8 us there; let alone on 1000+ nodes.
When I later confronted that person with it, the answer was: "Well, I hope you
realize that what you measured includes the MPI overhead, which can be
significant; the numbers I quoted were measured without that stupid overhead."
But in the end you actually write software, and you *do* need to count that
'stupid overhead'.
>
>>
>> which compiler do you compile with?
>>
>> I hope gcc only and not intel c++?
>>
>> intel c++ is notorious with floating points in order to get faster at
>> benchmarks.
>>
>> Are you busy with floating point or with integers?
>
>
>I am currently using gcc's c compiler, ver 3.3.x. And doing mostly
>integer calculations currently.
gcc should have no bugs there, except for PGO.
>
>>
>> Are you using PGO with gcc? (pgo = profile guided optimizations)
>>
>> There is major bugs even in latest 3.4.3 gcc in the PGO.
>>
>> Those guys are all volunteers and very cool guys.
>>
>> Very slow in bugfixing as they have other jobs too, and i don't blame
>> them.
>>
>
>Actually, I haven't, but profilers are one of the things that I want to
>get more familiar with... Thank you for the suggestions, I really
>appreciate them having done limited parallel programming on this scale.
I'm not referring to profilers, but to, for example, first compiling with:
# gcc 3.3.3 (SuSE) in the case of x86-64:
CFLAGS = -O3 -fprofile-arcs -march=k8 -mcpu=k8
Then run your program on a single CPU for a while, quit the program, and
remove all object files.
Then recompile with:
CFLAGS = -O3 -fbranch-probabilities -march=k8 -mcpu=k8
# note that in gcc 3.4.x 'mcpu' has been renamed to 'mtune'
Otherwise, by default use something like this, with the right processor name
for the CPU you actually have:
CFLAGS = -O2 -mcpu=athlon-xp -march=athlon-xp
Take care to optimize GMP for the processor in question as well; it makes a
difference.
>Regards,
>-J Zamor
>jzamor at gmail.com
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Wed Mar 16 16:23:09 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 16 Mar 2005 16:23:09 -0500 (EST)
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
In-Reply-To:
References: <19430ea71246a4e9a13899873914bff7@gmail.com>
Message-ID:
On Wed, 16 Mar 2005, Josh Zamor wrote:
> --strPVM.c--
>
> #include <stdio.h>
> #include <pvm3.h>
>
> int main(int argc, char** argv) {
> int info, mytid, myparent, child[2];
Add
char buf[3];
>
> if(mytid = pvm_mytid() < 0) {
> pvm_perror("Could not get mytid");
> return -1;
> }
>
> myparent = pvm_parent();
> if((myparent < 0) && (myparent != PvmNoParent)) {
> pvm_perror("Some odd errr for my parent");
> pvm_exit();
> return -1;
> }
>
> /* I am parent */
> if(myparent == PvmNoParent) {
> info = pvm_spawn(argv[0], NULL, PvmTaskDefault, NULL, 2, child);
>
> for(int i = 0; i < 2; ++i) {
> if(child[i] < 0)
> printf(" %d", child[i]);
> else
> printf("t%x\t", child[i]);
> }
> putchar('\n');
>
> if(info != 2) {
> pvm_perror("Kids didn't all spawn!");
> pvm_exit();
> return -1;
> }
>
> for(int i = 0; i < 2; ++i) {
> char* retStr;
delete line ^^^^^^^^^^^^^^
> info = pvm_recv(-1, 11);
> info = pvm_upkstr(retStr);
change to
info = pvm_upkstr(buf);
> printf("Recieved return string: %s\n", buf);
> }
>
> pvm_exit();
> return 0;
> } else { /* Child follows */
> char str[3];
delete, and change to
buf[0] = 'a';
buf[1] = 'b';
buf[2] = (char)0;
(noting that:
>> str[3] = (char)0;
is a bug!)
Or use
snprintf(buf,3,"ab");
Or strncpy.
> pvm_initsend(PvmDataDefault);
> pvm_pkstr(buf);
^^^
> pvm_send(myparent, 11);
> pvm_exit();
> return 0;
> }
> }
>
The code you sent looks to me to be wrong. You declare retStr to be a
char *, but there is no space associated with it and you don't
initialize it. I believe that PVM's pvm_upkstr COPIES the received
string into the destination rather than setting the pointer to point to
anonymous memory somewhere, since the latter would leak like a sieve.
Malloc'ing the vector should have worked also, but depending on where
str[] was allocated, just referencing str[3] could have caused a segment
violation.
If you want to see an example of a guaranteed-working program that sends
a simple string using pvm all packaged up neatly to use as a template
for doing actual work, you might look at
http://www.phy.duke.edu/~rgb/General/project_pvm.php
I put this up as an example PVM program template for a CWM column last
year sometime. It keeps the master and slave code separate (and builds
them separately). I can't see any good reason that your "fork/exec"
style code shouldn't work, though. Declaring the local variables inside
conditional segments might cause their location to be relatively likely
to cause segment violation problems when you access the string beyond
its allocated length, although I confess that USUALLY this will "work".
That's probably why it worked on the mac, and why it might work on linux
if you move things around a bit but leave things otherwise the same.
SO, two definite bugs -- must allocate the target memory block one way
or another and must not reference an allocated vector past the end.
We'll have to see if these are "the" bugs -- there may be others.
Either one would cause a segment violation in particular, though.
In a minute I'll try to compile your program and test the fixes I
suggest above (I know, should probably have done this FIRST, right?:-).
HTH,
rgb
> Also, for the character array "str" in the child segment above, I have
> tried using malloc to create the memory, and using const char arrays as
> well. While all of these methods give seg faults on the linux machine,
> the above way was the only way that I tried that worked and didn't give
> a bus error on Mac OSX.
>
> Thanks again.
>
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From rgb at phy.duke.edu Wed Mar 16 16:34:58 2005
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 16 Mar 2005 16:34:58 -0500 (EST)
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
In-Reply-To:
References: <19430ea71246a4e9a13899873914bff7@gmail.com>
Message-ID:
On Wed, 16 Mar 2005, Josh Zamor wrote:
(Stuff).
OK, it works perfecto-mundally. See code below:
#include <stdio.h>
#include <string.h>
#include <pvm3.h>

int main(int argc, char** argv) {
    int info, mytid, myparent, child[2];
    char buf[1024];

    if(mytid = pvm_mytid() < 0) {
        pvm_perror("Could not get mytid");
        return -1;
    }

    myparent = pvm_parent();
    if((myparent < 0) && (myparent != PvmNoParent)) {
        pvm_perror("Some odd errr for my parent");
        pvm_exit();
        return -1;
    }

    /* I am parent */
    if(myparent == PvmNoParent) {
        info = pvm_spawn(argv[0], NULL, PvmTaskDefault, NULL, 2, child);

        for(int i = 0; i < 2; ++i) {
            if(child[i] < 0)
                printf(" %d", child[i]);
            else
                printf("t%x\t", child[i]);
        }
        putchar('\n');

        if(info != 2) {
            pvm_perror("Kids didn't all spawn!");
            pvm_exit();
            return -1;
        }

        for(int i = 0; i < 2; ++i) {
            info = pvm_recv(-1, 11);
            info = pvm_upkstr(buf);
            printf("Received return string: %s\n", buf);
        }

        pvm_exit();
        return 0;
    } else { /* Child follows */
        strcpy(buf,"Testing PVM's string line.");
        pvm_initsend(PvmDataDefault);
        pvm_pkstr(buf);
        pvm_send(myparent, 11);
        pvm_exit();
        return 0;
    }
}
rgb at lilith|B:1074>gcc -O3 -std=c99 -g -I/usr/share/pvm3/include
-L/usr/share/pvm3/lib/LINUXI386 -o pvm_test pvm_test.c -lpvm3 -lm
rgb at lilith|B:1075>/tmp/pvm_test
t4000e t4000f
Received return string: Testing PVM's string line.
Received return string: Testing PVM's string line.
(So it was probably one or the other of the bugs I pointed out.)
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Wed Mar 16 16:36:48 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Wed, 16 Mar 2005 22:36:48 +0100
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
Message-ID: <3.0.32.20050316223648.01639c00@pop.xs4all.nl>
At 09:30 AM 3/16/2005 -0500, Robert G. Brown wrote:
>On Tue, 15 Mar 2005, Josh Zamor wrote:
>I assume that you've experimented and have no difficulty returning and
>unpacking ordinary ints, strings, or raw data blocks with PVM. If so
>you probably aren't making a pointer error on the master server side,
>although it never hurts to check.
I wouldn't count on it being bug-free; it usually takes a long time before
people discover the beautiful 'sizeof' operator in C, and as we know it will
give back '4' in a lot of cases on his 32-bit XP box and '8' on the 64-bit G4.
For example (but not limited to):
printf("sizeof(long) = %i\n", (int)sizeof(long));
Plain int is not safe either, as the standard does not force it to be 32 bits.
Then there are the usual casting and comparison problems. Notorious is
this compare:
int a; unsigned int b;
if( b == a ) // the signed operand is converted to unsigned; rarely intended
The odds of bugs in the program are like 99.99%.
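A small stand-alone illustration of both traps (assuming a typical platform
where int is 32 bits; this is not taken from the program under discussion):

#include <stdio.h>

int main(void) {
    printf("sizeof(int) = %d, sizeof(long) = %d, sizeof(void*) = %d\n",
           (int)sizeof(int), (int)sizeof(long), (int)sizeof(void *));

    int a = -1;
    unsigned int b = 0xFFFFFFFFu;

    /* The usual arithmetic conversions turn 'a' into an unsigned value
       before the compare, so with 32-bit int this prints "equal",
       which is rarely what was intended. */
    if (b == a)
        printf("equal\n");
    else
        printf("not equal\n");

    return 0;
}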
>If you want other eyes on your actual code (might be useful if it is
>indeed programmer error) please post.
>
> rgb
>
>--
>Robert G. Brown http://www.phy.duke.edu/~rgb/
>Duke University Dept. of Physics, Box 90305
>Durham, N.C. 27708-0305
>Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
>
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jzamor at gmail.com Wed Mar 16 15:25:01 2005
From: jzamor at gmail.com (Josh Zamor)
Date: Wed, 16 Mar 2005 13:25:01 -0700
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
In-Reply-To: <3.0.32.20050316183746.0148e730@pop.xs4all.nl>
References: <3.0.32.20050316183746.0148e730@pop.xs4all.nl>
Message-ID: <3ec3474f03b446c9a78f4288e3051ec8@gmail.com>
On Mar 16, 2005, at 10:37 AM, Vincent Diepeveen wrote:
>
> Did you configure GMP correctly?
>
> For math with big numbers it default does not use FFT calculations but
> way
> slower methods. You might want to recompile it with FFT enabled in
> case you
> didn't do this yet.
>
I actually haven't done this, though I'll certainly try that soon.
>
> In general in parallel programming the worst performance you get when
> all
> processes must report to 1 central process.
>
> It's far more efficient when each process is equal and 'divides' the
> work
> done.
>
> A simple calculation example of a problem i had at a 512 processor SGI
> is
> that each 'hub' can handle at most 680MB data per second (for 4
> processors
> in total yes).
>
> However if 499 other processors start reading/writing from/to this
> 'hub'
> then real disasters will happen.
>
> Things will completely lock up. Not only because all processors must
> divide
> the small bandwidth, but also because you will get switch latency
> overhead
> problems of routers and switches.
>
> If they first must stream a few bytes data from A to B and then
> suddenly
> from C to D, that's far less efficient than when 1 switch/router must
> stream only from A to B.
>
> Switches and routers sometimes have their own cache which is optimized
> for
> those benchmark streaming tests simply. Switch latency can cause
> serious
> problems if all processors want to use the same communication
> resources.
>
> The general rule is to keep the routers/switches as less possible as
> busy
> and try to make embarrassingly as possible parallel software.
This is exactly the sort of thing that I will be looking for shortly,
do you have any recommendations on either books or online texts that
cover this sort of thing (best practices when programming for
clusters)? I'm currently just experimenting, but this is a field that I
think I want to get involved in.
>
> which compiler do you compile with?
>
> I hope gcc only and not intel c++?
>
> intel c++ is notorious with floating points in order to get faster at
> benchmarks.
>
> Are you busy with floating point or with integers?
I am currently using gcc's c compiler, ver 3.3.x. And doing mostly
integer calculations currently.
>
> Are you using PGO with gcc? (pgo = profile guided optimizations)
>
> There is major bugs even in latest 3.4.3 gcc in the PGO.
>
> Those guys are all volunteers and very cool guys.
>
> Very slow in bugfixing as they have other jobs too, and i don't blame
> them.
>
Actually, I haven't, but profilers are one of the things that I want to
get more familiar with... Thank you for the suggestions, I really
appreciate them having done limited parallel programming on this scale.
Regards,
-J Zamor
jzamor at gmail.com
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From steve_heaton at ozemail.com.au Wed Mar 16 21:42:56 2005
From: steve_heaton at ozemail.com.au (steve_heaton at ozemail.com.au)
Date: Thu, 17 Mar 2005 13:42:56 +1100
Subject: [Beowulf] Re: The move to gigabit - technical questions
Message-ID: <20050317024256.CAXF29550.swebmail01.mail.ozemail.net@localhost>
G'day all
Somewhat relevant... Part of my benchtesting exercise on my DIY beowulf was a comparison between running the onboard FastEthernet vs. adding an Intel 1000MT GigaE adapter.
I changed the MPI config to ensure the MPI traffic had the GigaE to itself and "other" traffic went via the FastE.
I ran the full MPI perftest suite. Sample graphs on this web page:
http://members.ozemail.com.au/~sheaton/lss/
-> Computing
It was indeed "a bit" faster.
There's some NetPipe results in there too. I can provide more details if any one is interested.
Note: I know magic can be worked with the Intel driver but this is "vanilla" ATM.
Cheers
Steve
This message was sent through MyMail http://www.mymail.com.au
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From eugen at leitl.org Thu Mar 17 15:36:29 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Thu, 17 Mar 2005 21:36:29 +0100
Subject: [Beowulf] [Bioclusters] notes and pictures from a "wet lab
baby-biocluster" project (fwd from dag@sonsorol.org)
Message-ID: <20050317203629.GJ17303@leitl.org>
----- Forwarded message from Chris Dagdigian -----
From: Chris Dagdigian
Date: Thu, 17 Mar 2005 14:42:10 -0500
To: "Clustering, compute farming & distributed computing in life science informatics"
Subject: [Bioclusters] notes and pictures from a "wet lab baby-biocluster"
project
Organization: Bioteam Inc.
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US;
rv:1.7.5) Gecko/20041217
Reply-To: "Clustering, compute farming & distributed computing in life science informatics"
I've had a blast the past few days doing rack-and-stack work that I
normally don't get to do much anymore. Rough notes and a link to the
images follow...
The pictures:
-------------
http://bioteam.net/gallery/wetlabcluster
The challenge:
--------------
In 12 days or less, design a cluster, source the parts and put it
together in good working order. The cluster must meet the following
requirements:
- Capable of operating in a wet lab setting
- Managed and operated by biologists
- Linux OS required (software dependencies ...)
- Require no more than 2x 20-amp power circuits
- ~ 4 terabyte raw storage requirement; HA or super-performance not a
requirement
- Quieter than the instruments surrounding it
- Small enough to (roughly) fit under a lab bench
- Have sufficient CPU power to meet analytical needs
- Capable of automatically processing data coming off one or more high
end instruments
The components:
---------------
I can't share details about the requirements gathering phase of the
project. We studied the instrument, the science and the stuff that
needed to be done with the data coming off the instrument and determined
that approx. six dual-processor boxes with AMD Opteron CPUs would be
acceptable. Under a massive time crunch and some components were ordered
purely on the basis of "how fast can you ship to us..."
The parts list boiled down to the following pieces:
From CDW.com with rush delivery :)
- Digi CM 16 serial console server
- Pair of 20-amp APC rack-mount power distribution units
- Dirt cheap SMC 24-port gigabit ethernet unmanaged switch
- Box of serial DB9 to cat5 RJ45 adaptors for serial console
- Bulk quantities of 5ft grey cat5e cables (no time for special colors
or lengths)
From IBM via a local reseller/integrator:
- 7x IBM eSeries 326 1U rackmount dual-Opteron servers (6 nodes + master)
From Apple:
- Apple Xserve RAID with 14x 400gb drives
- Apple PCI-X fiber channel HBA card & cables
- Xserve RAID spare parts kit
From Extrememac.com:
- Small form factor 12U "XRack Pro2" cabinet (http://www.xrackpro.com/)
The problems:
-------------
The biggest overall problem was that the Apple Xserve RAID was ordered
with Fedex shipping but without priority delivery. This means that the
storage arrived at 5pm the night before our final cluster-assembly work
day. It also arrived with damaged rackmount rails but the damage was not
enough to make the hardware unusable.
Even worse, the cluster cabinet arrived at 1pm *on* our final work day.
This was in spite of the fact that the cabinet had been ordered via
credit card directly from Extrememac 7 or 8 days prior. As a vendor,
they were not really on the ball with things but this could be normal
for a company that seems to mostly make iPod accessories. Hopefully just
a fluke experience.
The IBM hardware arrived quickly and the reseller/integrator did a good
job. A minor hassle was that we had to order 15,000RPM Ultra320 scsi
drives because the cheaper 10,000RPM drives were on some sort of IBM
global "short supply" list.
The biggest problem with IBM and the reason I'll probably never purchase
eSeries servers again is that apparently IBM refuses to sell any sort of
generic rail mounting kit for the e-series product line (this is what
the integrator told me; have not verified this yet). They ship with rail
kits that *only* work in IBM branded server cabinets. Given that we were
installing into a non-IBM 12U cabinet this was a big issue. Our
integrator found a 3rd party rail reseller that makes compatible rails
but we could not order them in time. To me this is just annoying and (if
true) due to the annoyance factor I'll probably buy my dual Opterons
from Sun in the future (assuming Sun will sell me a generic rail kit...)
Final thoughts:
---------------
The 64bit version of Suse 9.2 Professional handled the fibre channel
storage amazingly cleanly. It detected, mounted and provisioned the 2
Apple RAID LUNs into a LVM group with no problem at all. I was expecting
the Linux -> Apple RAID stuff to be a bit more scary.
I really like the XRack Pro2 cluster cabinet, or whatever its marketing
name is. It is well assembled, with good options for choosing between quiet and
cooling. There is plenty of space for wiring and cable runs even if all
12U are packed with equipment. We have everything powered up today and
working hard and are monitoring the temperature conditions internally.
The Xserve RAID is one of the quietest storage arrays I've ever seen - I
thought it would be louder than the IBM rack-mounts but this is not the
case.
The biggest liability in this cluster is the lack of an internal UPS
capable of cleanly shutting down the Xserve RAID chassis. There was
simply no more room in the cabinet. We'll do external UPS for now and if
we can squeeze out 1 compute node there is the possibility of installing
one of the 1U UPS systems made by APC.
-Chris
--
Chris Dagdigian,
BioTeam - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag Web: http://bioteam.net
_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters
----- End forwarded message -----
--
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL:
-------------- next part --------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Thu Mar 17 20:14:37 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 18 Mar 2005 02:14:37 +0100
Subject: [Beowulf] Re: The move to gigabit - technical questions
Message-ID: <3.0.32.20050318021437.01624af0@pop.xs4all.nl>
G'morning
At 01:42 PM 3/17/2005 +1100, steve_heaton at ozemail.com.au wrote:
>G'day all
>
>Somewhat relevant... Part of my benchtesting exersize on my DIY beowulf
was a comparison between running the onboard FastEthernet v's adding an
Intel 1000MT GigaE adapter.
>
>I changed the MPI config to ensure the MPI traffic had the GigaE to itself
and "other" traffic went via the FastE.
>
>I ran the full MPI perftest suite. Sample graphs on this web page:
>
>http://members.ozemail.com.au/~sheaton/lss/
>-> Computing
>
>It was indeed "a bit" faster.
You ship 1.01E+02 = 1.01 * 10^2 = 101 bytes per block and reach roughly 240
megabit per second with it.
What would be interesting to know is how much SYSTEM time you lose while
shipping 240 megabit per second in, say, 2MB blocks rather than 101 bytes;
101 bytes is really very small, IMHO.
A system is obviously pretty much useless when the network leaves you no
processor time.
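One rough way to answer that question is to snapshot getrusage() around the
send loop and compare user versus system CPU time. The sketch below is only
an illustration; 'fd', 'buf' and 'len' stand for an already-connected socket
and the block being sent, and are not from the benchmark under discussion:

#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

static double tv_seconds(struct timeval tv) {
    return tv.tv_sec + tv.tv_usec / 1e6;
}

/* Send 'repeats' blocks of 'len' bytes over the connected socket 'fd'
   and report how much user and system CPU time the loop consumed. */
void report_cpu_cost(int fd, const char *buf, size_t len, int repeats) {
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);
    for (int i = 0; i < repeats; ++i)
        write(fd, buf, len);        /* error handling omitted in this sketch */
    getrusage(RUSAGE_SELF, &after);

    printf("user: %.3f s, system: %.3f s\n",
           tv_seconds(after.ru_utime) - tv_seconds(before.ru_utime),
           tv_seconds(after.ru_stime) - tv_seconds(before.ru_stime));
}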
>There's some NetPipe results in there too. I can provide more details if
any one is interested.
How much processor time is left while running NetPipe?
Do your cards use DMA?
>Note: I know magic can be worked with the Intel driver but
>this is "vanilla" ATM.
Vanilla is what we need with respect to gigabit.
The non-vanilla theoretical 'raw data throughput' and raw latency without
protocol overhead are for the high-end vendors, who surely solved the
processor-time issue long, long ago :)
>Cheers
>Steve
>
>
>This message was sent through MyMail http://www.mymail.com.au
>
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From laytonjb at charter.net Fri Mar 18 08:51:31 2005
From: laytonjb at charter.net (Jeffrey B. Layton)
Date: Fri, 18 Mar 2005 08:51:31 -0500
Subject: [Beowulf] [Fwd: [O-MPI users] Fwd: Thoughts on an MPI ABI]
Message-ID: <423ADCE3.7090508@charter.net>
Good morning,
I've been a little behind on reading the mailing lists, but I saw
this post on the Open-MPI mailing list and thought it might
be of interest to people on this list since this has been a recent
topic of discussion.
Enjoy!
Jeff
-------------- next part --------------
An embedded message was scrubbed...
From: Jeff Squyres
Subject: [O-MPI users] Fwd: Thoughts on an MPI ABI
Date: Sun, 13 Mar 2005 13:36:20 -0500
Size: 30766
URL:
-------------- next part --------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From atp at piskorski.com Fri Mar 18 10:11:09 2005
From: atp at piskorski.com (Andrew Piskorski)
Date: Fri, 18 Mar 2005 10:11:09 -0500
Subject: [Beowulf] Re: The move to gigabit - technical questions
In-Reply-To: <3.0.32.20050318021437.01624af0@pop.xs4all.nl>
References: <3.0.32.20050318021437.01624af0@pop.xs4all.nl>
Message-ID: <20050318151108.GA74656@piskorski.com>
On Fri, Mar 18, 2005 at 02:14:37AM +0100, Vincent Diepeveen wrote:
> At 01:42 PM 3/17/2005 +1100, steve_heaton at ozemail.com.au wrote:
> >Note: I know magic can be worked with the Intel driver but
> >this is "vanilla" ATM.
>
> Vanilla is what we need with respect to gigabit.
Well no, it's not. When he said "vanilla", I believe he meant, "Using
the stock Linux driver settings with no attempt at tuning for the needs
of my particular HPC application." Thus he has done only part of the
work necessary to fully answer the question, "How much better are
these gigabit cards than using 100 megabit ethernet for me".
Since he is using Intel Pro/1000 cards, he probably should also try
using GAMMA rather than TCP/IP.
--
Andrew Piskorski
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From diep at xs4all.nl Fri Mar 18 15:38:04 2005
From: diep at xs4all.nl (Vincent Diepeveen)
Date: Fri, 18 Mar 2005 21:38:04 +0100
Subject: [Beowulf] Re: The move to gigabit - technical questions
Message-ID: <3.0.32.20050318213803.0163a370@pop.xs4all.nl>
At 10:11 AM 3/18/2005 -0500, Andrew Piskorski wrote:
>On Fri, Mar 18, 2005 at 02:14:37AM +0100, Vincent Diepeveen wrote:
>> At 01:42 PM 3/17/2005 +1100, steve_heaton at ozemail.com.au wrote:
>
>> >Note: I know magic can be worked with the Intel driver but
>> >this is "vanilla" ATM.
>>
>> Vanilla is what we need with respect to gigabit.
>
>Well no, it's not. When he said "vanilla", I believe he meant, "Using
>the stock Linux driver settngs with no attempt at tuning for the needs
>of my particular HPC application." Thus he has done only part of the
>work necessary to fully answer the question, "How much better are
>these gigabit cards than using 100 megabit ethernet for me".
>
>Since he is using Intel Pro/1000 cards, he probably should also try
>using GAMMA rather than TCP/IP.
I'm sorry, but I can't take GAMMA seriously at the moment.
All my machines here are dual machines, and last time I checked there was no
chance they could work with GAMMA.
>--
>Andrew Piskorski
>http://www.piskorski.com/
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
>
>
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From nelsoneci at gmail.com Fri Mar 18 15:52:19 2005
From: nelsoneci at gmail.com (Nelson Castillo)
Date: Fri, 18 Mar 2005 15:52:19 -0500
Subject: [Beowulf] Can I write Etherboot to MBR?
Message-ID: <2accc2ff050318125257e783b5@mail.gmail.com>
Hi.
I'm booting a lot of machines in a diskless cluster.
I don't want to wipe the actual content of the hard disks. I'd like to
overwrite the MBR without using a boot loader.
I've been able to use a zlilo image with both grub and lilo.
I've been able to use a zdsk image booting from a floppy.
But I just want to do something like:
cat eepro100.zdsk > /dev/sda
just as I do
cat eepro100.zdsk > /dev/fd0
Can I do this?
Regards,
Nelson.-
--
Homepage : http://geocities.com/arhuaco
The first principle is that you must not fool yourself
and you are the easiest person to fool.
-- Richard Feynman.
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From linuxslacker at gmail.com Fri Mar 18 18:52:14 2005
From: linuxslacker at gmail.com (Chris Peterson)
Date: Fri, 18 Mar 2005 16:52:14 -0700
Subject: [Beowulf] Gigabit Switch
Message-ID: <219b037a05031815521e616dbb@mail.gmail.com>
Hi,
This is a little off topic but..
We have a gigabit switch that needs to be replaced because it does not
support jumbo frames. Does anyone know of a good 24-port switch with
a mini-GBIC port? We have looked at the SMC8624T, but it is out of our
price range.
Chris Peterson
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From cap at nsc.liu.se Fri Mar 18 04:12:07 2005
From: cap at nsc.liu.se (Peter =?iso-8859-1?q?Kjellstr=F6m?=)
Date: Fri, 18 Mar 2005 10:12:07 +0100
Subject: [Beowulf] Re: The move to gigabit - technical questions
In-Reply-To: <20050317024256.CAXF29550.swebmail01.mail.ozemail.net@localhost>
References: <20050317024256.CAXF29550.swebmail01.mail.ozemail.net@localhost>
Message-ID: <200503181012.13311.cap@nsc.liu.se>
Hello,
e1000 is a really good performer _IF_ you switch off ITR
(InterruptThrottleRate) when loading the module (this is not the default).
Try adding the following to your modules.conf (or modprobe.conf if 2.6)
options e1000 InterruptThrottleRate=0,0
(use =0 if one NIC, 0,0 if two, 0,0,0,0 if four... and so on).
/Peter
On Thursday 17 March 2005 03.42, steve_heaton at ozemail.com.au wrote:
> G'day all
>
> Somewhat relevant... Part of my benchtesting exersize on my DIY beowulf was
> a comparison between running the onboard FastEthernet v's adding an Intel
> 1000MT GigaE adapter.
>
> I changed the MPI config to ensure the MPI traffic had the GigaE to itself
> and "other" traffic went via the FastE.
>
> I ran the full MPI perftest suite. Sample graphs on this web page:
>
> http://members.ozemail.com.au/~sheaton/lss/
> -> Computing
>
> It was indeed "a bit" faster.
>
> There's some NetPipe results in there too. I can provide more details if
> any one is interested.
>
> Note: I know magic can be worked with the Intel driver but this is
> "vanilla" ATM.
>
> Cheers
> Steve
>
>
> This message was sent through MyMail http://www.mymail.com.au
--
------------------------------------------------------------
Peter Kjellstr?m |
National Supercomputer Centre |
Sweden | http://www.nsc.liu.se
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL:
-------------- next part --------------
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
From jzamor at gmail.com Fri Mar 18 10:29:40 2005
From: jzamor at gmail.com (Josh Zamor)
Date: Fri, 18 Mar 2005 08:29:40 -0700
Subject: [Beowulf] Seg Fault with pvm_upkstr() and Linux.
In-Reply-To:
References: <19430ea71246a4e9a13899873914bff7@gmail.com>