First the good news. I have thumper all configured and ready to roll as our mega file server. In fact it's already rolling. Note this isn't a public facing server, but will indirectly help the various public services in many ways, including making the sysadmins working on SETI@home/BOINC a lot happier in general. Lots of really fast disk storage for database backups, raw data transfer buffers, doesn't randomly reboot itself like our current home account server, etc.

Mmmkay. Now the less good news. Looks like gowron is having some fundamental RAID issues. The issues have been whittled down to one RAID1 pair tagged as degraded that won't rebuild no matter what we do. The guys at Overland have been super helpful - but this is actually an old SnapAppliance (not a box that Overland sells) and running a (very) old version of the OS. So it's looking like our best bet to move forward is to upgrade the OS on the thing. However, to do so we need to copy the workunits on the system (about 2 terabytes' worth) elsewhere temporarily. How about... thumper! That copy process is happening now.

Meanwhile, we'll be off for the foreseeable future. Like at least until next week, I imagine. Bummer.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Thanks for the news Matt. I've been checking in every few hours, but looks like my little H.P. computer will have a nice, long, cool down break. Wish that there was something I could do to help you, but I don't know a thing about a server. Hope all goes well with the repairs.

Can't we do better than that? About 6 hours to copy the data one way. You should not have to copy it back, as it shouldn't be destroyed.

In a perfect world, yes.

But the data is coming off a degraded RAID, and it's talking over NFS, and it's competing with various other must-get-done backups writing to the same device, and it all will in fact be destroyed as this OS upgrade on the broken system (going up 2 major versions) will wipe out all current RAID configurations to make way for the larger root filesystem. And then we'll have to copy the data back.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Dang, my cache will be full of other projects by then and everything will go into EDF mode.

If you want to crunch SETI above all else, why on earth would you want to punish other projects by allowing them to fill a SETI-sized cache and then run into deadline trouble?

Turn the cache down while you know there's no work, then turn it back up - gradually - once SETI is back and work is flowing. No EDF, no deadlines missed, fastest possible return to SETI crunching, least stress on the download servers and comms. What's to lose?

It's a shame to hear that we will be down until next week, but if it has to be then it has to be. However, if as a by-product, Thumper is now helping to keep SETI@home/BOINC sysadmins happy, a daunting task at any time, then that is worth a few brownie points on its own!

I don't know about anyone else, but until fairly recently I thought the sum total of Seti's kit was as listed on the server page. I didn't realise there were other non-public facing machines like Gowron behind the scenes. Clearly the Seti project is even more complicated to Admin than I had previously thought.

Could we have a similar list of backroom kit, saying what they are and what they do?

Those are my principles, and if you don't like them ... well, I have others.
Groucho Marx 1895-1977

I also have mine, and if you don't like them ... tough, live with it.
Chris S 2016

Echoing KWSN Ekky above, it's just down to the CUDA units being chewed up from SETI, one-by-one now. Figured it was time to start punishing the EINSTEIN@home servers and downloaded a bunch of their work units. (One project or another will have available work.) Take your time and get the job done right. Can't wait to start building up the Pending list again. :-)

And thank you for both the info and the never-ending efforts of the whole SETI crew. You are the original True Believers.

Best I could locate on a fast trip through the thread and the about pages, there have been a few changes since then.

I have a feeling that your download servers are gonna be slammed the second they come back up. I have over 2100 tasks ready to be returned on all my computers combined and my RAC is a pittance compared to the big boys, who will have many many times more. I hope that doesn't crash everything again. I'm sure y'all are clever enough to prevent that though.

As always, thanks for all that you mighty admins do to keep things up and running and I'll be here ready to crunch more tasks when you get everything back online. :)

Too bad you've been having so much trouble with storage lately, all the hard drive and RAID issues. Is there a reason you don't use a SAN solution or is it just a matter of funding for all the hardware that would involve?

Hope everything comes up better and stronger when the repairs are done.