From saville at comcast.net Mon Feb 18 14:42:11 2008
From: saville at comcast.net (Gregg Germain)
Date: Tue Nov 9 01:14:29 2010
Subject: [scyld-users] Cluster up - no action on slaves
Message-ID: <47BA09C3.6090503@comcast.net>
>> I have the freeware version of SCYLD Beowulf up and running on a 5
>> node system. I've added the 4 slaves to the Master using Beosetup. The
>> slaves boot and the status monitor shows them as being up. I can ping
>> them using their IP address. I ran the beofdisk, beoboot-install, and
>> bpctl commands as instructed by SCYLD.
>> 3) I ran a simple Hello World program (on the Master and two slaves),
>> using MPI calls (not BeoMPI) and I get the following output:
>>
>> $ mpirun -np 3 HelloWorld
>> I am the Master! Rank 0, size 3, name localhost.localdomain
>> Rank 1, size 3, name .0
>> Rank 2, size 3, name .1
>> So things SEEM to be working. However the Beowulf Status Monitor
>> statistics portion of the Slave nodes never budge. Ok maybe the program
>> runs too quickly to get a reaction.
>It's likely that you are not seeing anything on the display because
>your program is so trivial.
That's what I thought. So I wrote a simple program with a big delay
loop. Code fragment:
startwtime = MPI_Wtime();
/* Busy loop: 10^9 increments of n to keep the node's CPU occupied */
for (ii = 0; ii < 1000; ii++)
    for (jj = 0; jj < 1000; jj++)
        for (kk = 0; kk < 1000; kk++)
            n++;
endwtime = MPI_Wtime();
/* Build the report and send it to rank 0 */
sprintf(greeting,
        "Slave: rank %d of %d running on node: %s N-val is: %f "
        "for an elapsed loop time of: %f\n",
        rank, size, name, n, endwtime - startwtime);
MPI_Send(greeting, strlen(greeting) + 1, MPI_BYTE, 0, 1,
         MPI_COMM_WORLD);
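For anyone who wants to try the delay loop on a single machine without MPI, here is a minimal standalone sketch of the same idea. It is not the program I ran: busy_loop, time_busy_loop and the side parameter are names made up for illustration, clock() stands in for MPI_Wtime(), and the volatile qualifier is an assumption added so that an optimizing compiler cannot remove the loop.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: performs side^3 increments, mirroring the
   triple loop in the fragment above, and returns the final count. */
double busy_loop(int side)
{
    /* volatile is an assumption, not part of the original fragment:
       it forces every increment to be carried out. */
    volatile double n = 0.0;
    int ii, jj, kk;

    assert(side >= 0);
    for (ii = 0; ii < side; ii++)
        for (jj = 0; jj < side; jj++)
            for (kk = 0; kk < side; kk++)
                n = n + 1.0;
    return n;
}

/* Hypothetical helper: stores the count in *count and returns the
   CPU seconds used by busy_loop(side), measured with clock(). */
double time_busy_loop(int side, double *count)
{
    clock_t start = clock();

    *count = busy_loop(side);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_busy_loop(1000, &count) from a main() gives the same 10^9 iterations as the fragment; the elapsed time will of course depend on the machine.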
The results are ("einar" is the SCYLD Master, or -1, node):
$ mpirun -np 4 Loop
Hello World: rank 0 of 4 running on node: einar N-val is: 0.000000
Slave: rank 1 of 4 running on node: .0 N-val is: 1000000000.000000 for an elapsed loop time of: 74.098218
Slave: rank 2 of 4 running on node: .1 N-val is: 1000000000.000000 for an elapsed loop time of: 74.061121
Slave: rank 3 of 4 running on node: .2 N-val is: 1000000000.000000 for an elapsed loop time of: 74.063888
So the loop on each slave takes over 74 seconds, and still there's no
reaction on the Beowulf Status Monitor for the slave nodes.
So what does it take to get the slave entries on the Beowulf Status
Monitor to come to life?
> ..............so it's best to change the Beostatus update period to
> once per second.
How does one change the update period?
Thanks,
Gregg