Okay then. The MySQL commit behavior we were testing was an absolute failure - though for expected reasons (not enough disk I/O, even with the solid state drives). It was worth a shot, but we fell back to the old commit behavior for now.
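For anyone wondering what knob is involved: Matt doesn't name the exact setting, but the usual lever behind this kind of safe-vs-fast commit trade-off in MySQL/InnoDB is innodb_flush_log_at_trx_commit - so the following my.cnf sketch is an educated guess, not a description of the actual SETI configuration:

```ini
# my.cnf sketch -- ASSUMPTION: the "commit behavior" being tested is
# InnoDB's log-flush policy. The setting and its values are standard
# MySQL; the mapping to Matt's test is my guess.
[mysqld]
# 1 = fsync the log to disk at every commit: safest, but each commit
#     costs a disk I/O -- the kind of load that can swamp the disks.
# 2 = write to the OS at every commit, fsync ~once per second:
#     survives a mysqld crash, may lose ~1s of commits on an OS crash.
# 0 = write and fsync ~once per second: fastest, least safe.
innodb_flush_log_at_trx_commit = 2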

However, this caused a lot of backend processes to clog up, including the transitioners, which ultimately meant the splitters burned through all kinds of raw data files before they realized we had more than enough work on disk. This could have been bad, i.e. it could have filled up our workunit storage server, but luckily it didn't even come close.

Anyway, we reverted this morning and all the dams broke for a while... until we ran out of work to send out. Turns out the last 10 files I brought up from Arecibo are all broken. Fwa wa wa waaaaa. This is particularly frustrating, as I was busting my hump trying to get enough work online before the long holiday weekend, and now we have zero. So it'll be up to me and Jeff to check in over the next few days and kick the pipeline along. We'll be out of real work to send out until this evening at the earliest, and will quite probably hit long periods of no work throughout the weekend. Fine.

In better news, we did the last bits of work to get the Astropulse signal table fully copied over to another database fragment - only losing a few rows here and there (as opposed to the many thousands originally thought). Work will resume on Monday to swap the old and new fragments, and hopefully the science database will be much happier.

That's it for now.

- Matt-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

So it'll be up to me and Jeff to check in over the next few days and kick the pipeline along.

Are you crazy? Go on, have that same extra-long weekend that everyone else has - medical, fire and police people excluded. I'm sure people's computers can do without work for a while, and the people in charge of said computers have plenty of other projects to choose from to ride out any workflow problems Seti has.

Have a fine weekend and a Happy Thanksgiving, Matt (and Jeff). :-)

Jord

Ancient Astronaut Theorists suggest that in many ways, you can be considered an alien conspiracy!

You're talking a lot about MySQL problems on that new monster of yours, and in case you weren't aware, I just thought you should know that MySQL scales horribly on a large number of cores. It can barely scale to 8, so there's virtually no chance that you're getting good results on 24.

Sun has an "official" writeup of a scaling attempt here: http://blogs.sun.com/mrbenchmark/entry/scaling_mysql_on_a_256

Their suggestion? Run several instances of MySQL on the same server. Which is a bit meh, but if you insist on running MySQL for a high-traffic project like this, it might be worth looking into splitting the most heavily trafficked tables into separate instances. Then you could also enable the safe commit behavior individually on just the critical instances.
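For the record, the stock way to do this is mysqld_multi, which reads numbered [mysqldN] groups from my.cnf. A minimal sketch - all paths, ports, and the table split are illustrative, not taken from the actual SETI servers:

```ini
# /etc/my.cnf -- mysqld_multi manages one mysqld per [mysqldN] group.
# Each instance needs its own port, socket, and datadir.
# HYPOTHETICAL layout for illustration only.
[mysqld_multi]
mysqld     = /usr/bin/mysqld_safe
mysqladmin = /usr/bin/mysqladmin

[mysqld1]
port    = 3306
socket  = /var/run/mysqld/mysqld1.sock
datadir = /data/mysql1            # e.g. the busiest result tables

[mysqld2]
port    = 3307
socket  = /var/run/mysqld/mysqld2.sock
datadir = /data/mysql2            # e.g. the critical tables
# ...and safe commits only where they matter:
innodb_flush_log_at_trx_commit = 1
```

You'd then start both with `mysqld_multi start 1,2` and point each part of the backend at the appropriate port/socket.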

You've also mentioned running into replica problems on jocelyn after a primary crash, which is very common. Here's a trick to get around it, at least temporarily.

After the crash, feed jocelyn the following through the MySQL console:
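(The actual commands didn't survive in this post; reconstructed from the context below, the standard sequence for this skip-to-next-binlog trick, using stock MySQL replication statements, would be:)

```sql
-- Reconstructed sketch: restart replication from the start of a
-- fresh binary log. [NEXT FILE] is explained below.
STOP SLAVE;
CHANGE MASTER TO
    MASTER_LOG_FILE = '[NEXT FILE]',
    MASTER_LOG_POS  = 4;   -- 4 = position of the first event in any binlog
START SLAVE;
```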

Replace [NEXT FILE] with the name of the first binary log file on mork that it started writing after the crash, typically mysql-bin.002342 or something similar. You can get the name by running SHOW MASTER STATUS; on mork. This will skip the corrupted end of the previous binary log, and restart replication from the new file.

Then verify that it's running with a SHOW SLAVE STATUS;
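The columns worth eyeballing in that output (these are standard MySQL status fields) are:

```sql
SHOW SLAVE STATUS\G
-- Slave_IO_Running:      Yes    (I/O thread pulling binlogs from mork)
-- Slave_SQL_Running:     Yes    (SQL thread applying the events)
-- Seconds_Behind_Master: should fall toward 0 as jocelyn catches up
-- Last_Error:            empty if replication is healthy
```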

Note that you *cannot be sure* that the state on the replica and the primary is now consistent unless you are using safe commits; however, it should still be consistent enough for non-science use.

Seti is always offline, it seems. I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. I've been with seti for a long time (9 years), and I'm sad to see so many problems with servers going down lately... can't you guys find someone smart enough to fix stuff? And buy lasting equipment, as my server machine is still in top shape after 6 years. I'd be willing to help seti in my spare time, if only I lived that far west, and I'm sure others would too, so don't rant to me about "costs" and the lack of funding.

They have no money for anything - the entire budget is from donations at the moment. Servers are mostly donated...

BOINC WIKI

... ,I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. ...

Same here. I have been with SETI from 1999 to 2004 and received a "we need your help" email a few months ago. Unfortunately I can only donate my computer time, but not money, and it seems (at least to me) that SETI either needs more donations or the existing donations are spent unwisely.

It is certainly none of my business, but the thought crossed my mind: is there info available about the funding situation and where the money goes? It also seems people are just interested in keeping the computers going to gain credit instead of actually finding ET. Isn't that what it's all about? And how are we going to do this without getting new data from Arecibo? It seems pointless to continue and to waste millions of kWh.

... ,I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place. ...

It also seems people are just interested in keeping the computers going to gain credit instead of actually finding ET. Isn't that what it's all about?

For some people, yes; what you need to remember is that S@H is now only one part of a larger distributed computing platform called BOINC. A lot of people are only interested in the number-crunching game, not the individual merits of any one project. Don't believe me? Just check out the shoutbox on BOINCstats: there's no loyalty to individual projects, just credit-chasers boasting about their latest multi-million-credit day on BOINC. S@H has become a victim of its own pioneering.

Seti is always offline it seems, I was enticed by emails to come back, but it seems the same thing is happening that made me leave in the first place

Then you didn't understand what was going on then, and you don't understand now.

Setting aside the funding issue, somewhat.

BOINC projects are supposed to do big science on very small budgets. That means they don't do things like redundant server clusters, multihomed sites, and all of the (expensive) things that hide the occasional outage - the things you might see at Amazon or Space.com.

If you have work in your cache, a server outage is a minor inconvenience. It's interesting to know about, but that's all.

The work isn't time critical, so getting reported later is no big deal.

I wish people would start realizing how well the entire system (client and server) works even when none of the individual components are 99.99% reliable.

In response to some less-than-positive comments about the state of SETI@home, its servers, not being able to obtain work, etc.: I volunteer the use of my computer because I feel like it. No one is twisting my arm to do so. I maintain a two-day work cache in anticipation of the occasional outages, along with participating in other BOINC projects. I rarely run out of work. If I did, no big deal.

I don't feel that the folks at SETI@home owe me some sort of debt of gratitude. On the contrary, I feel privileged to be able to participate in any of the BOINC projects.