Another catchup post. I'm still trying to page in everything I missed in July - it doesn't help that shortly after the last post I got a nasty summer cold. I'm back in business now.

We had another MySQL database server crash over the weekend, which Jeff handled remotely without much ado. The upload server also had its directly attached storage array freak out again. This is becoming a common event, and it leaves the software RAID in some funky state (which has always been reversible thus far).

Other than that, the servers are still chugging along. As for the grand server shuffle, progress has been made and a definite plan is in motion. Basically marvin is becoming bambi (the Astropulse database) and bambi is becoming bruno (the upload/BOINC admin server) and bruno is being turned off. Meanwhile some new machine (which we'll acquire somehow) will become thumper (the science database) and thumper will become ptolemy (internal file server) and ptolemy will shut off. Getting bruno and ptolemy out of the picture means two of the three servers prone to random crashes/hardware issues will no longer be on line. The third such server is mork, which is the only server remotely close to handling the MySQL database load, so there are no options for fixing that anytime soon. We have our hands full anyway fixing what we've got.
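For anyone who finds the shuffle hard to follow in prose, here is a rough sketch of it as a lookup table. The dictionary layout and the function name `hardware_to_power_off` are made up for illustration; only the host names and the moves themselves come from the post.

```python
# Hypothetical encoding of the shuffle: current hardware -> name/role it takes over.
# "new machine" is a placeholder for the box that has not been acquired yet.
MOVES = {
    "marvin": "bambi",         # becomes the Astropulse database
    "bambi": "bruno",          # becomes the upload/BOINC admin server
    "new machine": "thumper",  # becomes the science database
    "thumper": "ptolemy",      # becomes the internal file server
}


def hardware_to_power_off(moves):
    """Return the hosts whose name is taken over but which get no new role."""
    return sorted(set(moves.values()) - set(moves))


for old, new in MOVES.items():
    print(f"{old} -> {new}")
print("powered off:", ", ".join(hardware_to_power_off(MOVES)))
```

The two hosts that fall out of the table are the same two crash-prone machines the post says are going away.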

I also (finally) got a test suite working for all my birdie tests (i.e. putting a fake signal or "birdie" in the raw data, blanking it, splitting it, then running clients on it to see if the birdie still appears). This took me a while as I had to remember all the various bits and pieces of this puzzle, some of which I haven't touched for months. Now it's all in one big script, which is nice. Oh yeah, I also parallelized the software blanking pre-processing, so new data can get on line twice as fast as before (if resources are available).
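To give a feel for what a birdie test checks, here is a toy version in plain Python. It is not the actual SETI@home script: the function names, the zero-fill blanking, the sample count, and the plain DFT detector are all assumptions chosen to keep the sketch short, and the splitting and client steps are left out.

```python
import cmath
import math
import random


def make_noise(n, seed=0):
    """Stand-in for raw data: n samples of unit-variance Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def inject_birdie(samples, freq_bin, amplitude=3.0):
    """Add a fake sine-wave signal (the "birdie") at a known frequency bin."""
    n = len(samples)
    return [s + amplitude * math.sin(2 * math.pi * freq_bin * i / n)
            for i, s in enumerate(samples)]


def blank(samples, start, stop):
    """Toy blanking (an assumption): zero out samples in [start, stop)."""
    return [0.0 if start <= i < stop else s for i, s in enumerate(samples)]


def strongest_bin(samples):
    """Return the non-DC frequency bin with the most power (naive DFT)."""
    n = len(samples)
    power = []
    for k in range(n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        power.append(abs(acc) ** 2)
    return max(range(1, len(power)), key=power.__getitem__)


raw = inject_birdie(make_noise(256), freq_bin=40)
blanked = blank(raw, 64, 128)
print("birdie bin before blanking:", strongest_bin(raw))
print("birdie bin after blanking: ", strongest_bin(blanked))
```

The idea is the same as in the post: the test passes if the injected signal is still the strongest thing in the data after a chunk of it has been blanked.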

Jeff's going to put some newly compiled Astropulse back end services on line tomorrow. Hopefully that's all good or else we'll likely run out of work over the weekend (which happened last weekend, but was mostly hidden by the MySQL database server crash).

It's summertime, so people are in and out of the lab a lot, but enough of us will be in one room at the same time next week that more meaningful plans/management discussions will take place regarding NTPCkr and other scientific analysis stuff.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

It's summertime, so people are in and out of the lab a lot, but enough of us will be in one room at the same time next week that more meaningful plans/management discussions will take place regarding NTPCkr and other scientific analysis stuff.

RAC chasers, be afraid, be very afraid. Last time that happened we got three-day breaks! :)

There's only one thing I don't understand. There is a lot of talk about not having enough resources, and yet you are retiring two servers (again). Why don't you use them for whatever services they can still handle, even if it's only one or two? I really don't understand it. Why not use the older servers for one or two services, like one MB splitter and one AP splitter? When you put so many services on one server, you depend on that server working. If you spread them over more servers, you have more of a failsafe, so that when one goes down the whole project doesn't go offline!!!

And having read your tech posts for 2 years now, I know that you have a lot of old servers in your basement down at Berkeley!!!

Don't get me wrong, but I find it mind-boggling every time I read that something has gone wrong.

So now the question: Why don't you split the services more on the old servers???
____________
I do what I can and I can what I do! :P

So now the question: Why don't you split the services more on the old servers???

Because they are unreliable & keep failing.
And then people get all upset when they can't upload or download or report or all three until the servers have restarted & the databases have been checked & repaired if necessary.
____________
Grant
Darwin NT.

So now the question: Why don't you split the services more on the old servers???

Because they are unreliable & keep failing.
And then people get all upset when they can't upload or download or report or all three until the servers have restarted & the databases have been checked & repaired if necessary.

And when they do fail, the scientists at S@H become the most educated IT department in the world, and spend too much time on all the things mentioned above, when they should be looking for ET.
____________

The servers are the main problem. Several are pre-production units.
Once they've got servers that can be depended on, then they can spend more time working on the software.
____________
Grant
Darwin NT.

The servers are the main problem. Several are pre-production units.
Once they've got servers that can be depended on, then they can spend more time working on the software.

Donated pre-production servers at that. Hopefully there is enough for 1 or 2 good production blade servers of the type SETI needs to get. Last I heard, $7,000 was raised thanks to 1 loud mouth and 6 others, maybe SETI's equivalent of "The Magnificent Seven"... I have one old pre-production CPU running my current setup, which is awaiting its retirement from crunching. But the next CPU has to wait until supporting parts are acquired and outfitted before they're gone, and so I wait patiently, as I have lots to get done before the computer purchases can begin so that I can be done with this old hardware.
____________
My Facebook, War Commander, Under Dog