Despite efforts to reduce the outage time yesterday, the database was bloated enough (for various reasons) to take all day compressing/backing up. The replica wasn't even close to being done by the time I left the lab, and still wasn't done before I went to bed last night. That meant all queries had to be aimed at the master, including all the read-only stuff that usually hits the replica - stats collection scripts, result state count scripts, the daily credit multiplier calculation (which is rather expensive), and lots of annoying web scraping queries.
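For anyone curious what "aiming" queries means here, a minimal sketch of the fallback described above, not our actual scripts: read-only work goes to the replica when it is available, and everything lands on the master when it is not. The function name `pick_server` and the workload labels are made up for illustration.

```python
def pick_server(read_only: bool, replica_available: bool) -> str:
    """Return which database a query should be sent to.

    Hypothetical helper: writes always go to the master; read-only
    queries use the replica unless it is unavailable (e.g. still
    being rebuilt after a backup).
    """
    if read_only and replica_available:
        return "replica"
    return "master"


# Illustrative read-only workloads named in the post, plus one write.
workloads = [
    ("stats collection", True),
    ("result state counts", True),
    ("daily credit multiplier", True),
    ("web scraping queries", True),
    ("scheduler updates", False),  # assumed write workload, for contrast
]

for replica_up in (True, False):
    on_master = [name for name, ro in workloads
                 if pick_server(ro, replica_up) == "master"]
    print(f"replica available={replica_up}: master handles {on_master}")
```

With the replica down, every workload in the list ends up on the master, which is the overload we saw overnight.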

All those excess things pretty much killed us throughout the evening. The replica was finally available in the morning, albeit fairly far behind the master. Nevertheless I was able to start cleaning up the mess. However, two other problems were revealed.

First, going to one download server wasn't a good thing. It seems impossible to me that apache can't handle all the downloads on one system - especially given the abundance of free resources. It drops connections regardless of how much network/httpd.conf tweaking I do. So we fell back to using two download servers, and that immediately solved everything. Of course, we've been offline for 24 hours, so there's gonna be lots of traffic for a while making it hard to upload/download anything.

Second, there was minor corruption in the MyISAM tables in the mysql database. Not sure what caused that, but given the database was clogged all night all bets are off. The most notable effect of this was some weird behavior in the forums. Some simple "repair table" commands found the problems and claim to have fixed them.
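For reference, a rough sketch of what those commands look like, assuming you just want to generate the statements for a list of tables and paste them into the mysql client. `CHECK TABLE` and `REPAIR TABLE` are standard MySQL statements; the helper name and the table names below are placeholders, not necessarily the tables that were affected.

```python
def repair_statements(tables):
    """Build CHECK TABLE / REPAIR TABLE statements for MyISAM tables.

    Hypothetical helper: it only produces SQL text and does not
    connect to any database.
    """
    statements = []
    for name in tables:
        # Quote the identifier; a literal backtick is escaped by doubling it.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"CHECK TABLE {quoted};")
        statements.append(f"REPAIR TABLE {quoted};")
    return statements


# Placeholder table names for illustration only.
for stmt in repair_statements(["forum_posts", "forum_threads"]):
    print(stmt)
```

Running the check first shows which tables actually report errors before anything is repaired.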

Anyway.. it's clear we still have much work to do cleaning up our current mysql situation. Sigh.

In better news, it looks like Jeff and I are going to OSCON 2009 in San Jose in July - the O'Reilly open source convention. Maybe we'll get some hot tips about improving the linux/apache/mysql/php performance around here. Tim O'Reilly himself helped hook us up with free passes (he's been nice to us over the years).

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Ok, the down times are getting old. I preferred to have only one project running, and it appears this is not the one to maximize my computers when they are not in use. Guess it is time to find a different and more productive project. It has been a fun run.

No SINGLE project can do that. I've got 10 attached and all of them have had extended down times. It is the nature of the beast.

Downloads have been on the fritz for quite a while, even after the re-introduction of 2 download servers. Its symptoms are similar to the problem when uploads are maxed out.
What if Fiction was Fact and Fact was Fiction and vice versa?

Before someone says it was all better in Classic times (I'm just waiting for it... and am heading you off. ;-)), remember that they had these down times in those days as well, there was just less communication about it.

All the work out there was then just recycled over and over and over and over - up to 50 times over - at least if you were connected to one of the servers doing all that recycling. Otherwise your Seti program wasn't doing anything either.
Jord

Ancient Astronaut Theorists suggest that in many ways, you can be considered an alien conspiracy!

Yes, and many associate SETI@home with the SETI Institute. I had to explain to an Italian newspaper that it is not so, but I saw that Scientific American made the same mistake in an article promoting Docking@home.
Tullio

It's all about timing. Rejoining in the middle of an outage would be unnerving to say the least. Give it some time to heal, and all will be well until the next hiccup.

Seti is my primary project, but I have others in reserve if it drops off for a while.

As was said above, S*** Happens, AND USUALLY AT THE MOST INAPPROPRIATE TIME. My 10-day caches have weathered every outage I have encountered so far, and letting other projects run helps too. Sure my RAC has dropped a few thousand, but it will be back! (lost a quad as well!)

Perhaps the question to ask is why the glitch rate remains (apparently) high after all this time (years). It appears to those of us who know nothing that things are tweaked frequently and almost as frequently the first tweak begets a second tweak or an untweak, and so on. Yet after I don't know how many years the data we've processed remains un-analyzed. It is very frustrating.

My remedy has been to connect only in the wee hours of each Berkeley evening and keep a 10-day cache, so as to minimize the server chaos impacting my hosts' productivity. I also try to downshift my attitude before reading these boards so that I don't emotionally redline. After all, it is a 'hobby'. And, I think I'm going to turn off some of my old beasts for a while or forever; unlike the original premise of s@h, they are probably just adding more thermodynamic entropy to the universe than they can justify with seti "science".

Well I'll be a monkey's uncle! I've been waiting for BOINC to download 3 new jobs for the past 3 days, with some silly messages like wrong size, server down and so on... now I know! Just exit BOINC, restart BOINC and HELLO!!!! Everything is alright! There seems to be nothing really wrong with seti (breathe easier guys), it seems the problem lies with BOINC! I may be wrong here but what the hell...