Okay, we didn't fix the HE connections problem, but are getting closer to understanding what's going on. Basically our router down at the PAIX keeps getting a corrupted routing table. We reboot it, which flushes the pipes, but this only "evolves" the issue: people who couldn't connect before now can, but people who could connect before now cannot, or people don't see any change in behavior. This is likely due to a mixture of: (a) low memory on this old router, (b) our ridiculously high, constant rate of traffic, and perhaps also (c) a broken default route.

We're looking into (c) at the moment, and solving (a) may be far too painful (we don't have easy access to this router, which is a donated box mounted in donated rack space 30 miles away). So I've been arguing that we need to deal with (b) first, i.e. reduce our rate of traffic.

Part of reducing our traffic means breaking open our splitter code. Basically, one of the seven beams down at Arecibo has been busted for a while, thus causing a much-higher-than-normal rate of noisy workunits. We've come up with a way to detect a busted beam automatically in the splitter (so it won't bother creating workunits for said beam), but this means cracking open the splitter. This is a delicate procedure, as you can really screw things up if the splitter breaks - and it usually needs oversight from Eric, who is the only one qualified to bless any changes to it. Of course, Eric has been busy with a zillion other things, so this kept getting kicked down the road. But at this point we all feel this needs to happen, which should reduce general traffic loads, and maybe clear up other problems - like our seemingly overworked router facing HE being unable to handle the load.
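For illustration only - the function name, thresholds, and inputs below are my own assumptions, not the actual splitter logic - an automatic busted-beam check might boil down to comparing each beam's noisy-workunit rate against its siblings:

```python
# Hypothetical sketch: flag a beam as "busted" when its fraction of noisy
# workunits is far above the average rate of the other beams.
def busted_beams(noisy_counts, total_counts, ratio_threshold=5.0, min_total=100):
    """noisy_counts / total_counts: per-beam tallies (seven beams at Arecibo).

    Returns the list of beam indices whose noisy rate exceeds
    ratio_threshold times the mean rate of the remaining beams.
    """
    rates = [n / t if t else 0.0 for n, t in zip(noisy_counts, total_counts)]
    busted = []
    for i, rate in enumerate(rates):
        others = [r for j, r in enumerate(rates) if j != i]
        baseline = sum(others) / len(others)  # mean rate of the other beams
        # require a minimum sample size so one bad workunit can't flag a beam
        if total_counts[i] >= min_total and baseline > 0 and rate > ratio_threshold * baseline:
            busted.append(i)
    return busted
```

A splitter using something like this would simply skip workunit creation for any index returned, rather than shipping out noise for clients to crunch.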

Of course, it doesn't help that we're all bogged down in a wave of grant proposals and conferences, and I'm having to write a bunch of notes as part of a major brain dump since I'm leaving for two months (starting two weeks from now). I'll be on the road (all over eastern North America in September, all over Europe in October) playing keyboards/guitar with the band Secret Chiefs 3. It's been a crazy month thus far getting ready for that.

- Matt-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Maybe we should shoot for replacing the on-campus routers as well, in the event the higher bandwidth could be utilized.

While cleaning up the work units is important, that is at best a temporary patch.

You said donated rack space and router. Are you saying if the donated router was replaced the rack space donation would go away? Or is the rack space such that no other box will fit? Just trying to get the political angle if any sorted.

Another question that comes to mind: is the box on the latest version of the manufacturer's software? Perhaps an update has been issued and it needs to be flashed.

[edit] Isn't that box a gig router? Shouldn't it be able to handle 10X the traffic (routing table) it is now getting? RAM failure?

I'll be on the road (all over eastern North America in September, all over Europe in October) playing keyboards/guitar with the band Secret Chiefs 3. It's been a crazy month thus far getting ready for that.

Les Claypool? Faith No More? Who knew you were so cool, Matt? Let me know when you make it to Ozzy Osbourne and I'll book the first flight out there and be your groupie, carrying your instruments. ;-D

Part of reducing our traffic means breaking open our splitter code. Basically, one of the seven beams down at Arecibo has been busted for a while, thus causing a much-higher-than-normal rate of noisy workunits.

How many units per 1,000 (or 10,000 etc) are noisy?
I myself haven't seen many noisy Work Units - a couple of times I've had a group of 5-8 that only ran for 10-20 secs before finishing, but this is out of 2,500-3,000 Work Units in the cache at that time. What I have been seeing is a lot more shorties than in the past - i.e. WUs that take only 3 min or less to run on my video card. The work mix, apart from a burst of almost nothing but shorties a week or two ago, appears to be some longer-running WUs but a much higher percentage of shorter-running WUs, with very few mid-range WUs at all (although over the last few days there have been a few more mid-runtime WUs than there have been).

Also, the fact that my caches have only been full a couple of times over the last month would be contributing to the extended periods of high traffic - the frequent glitches that have been bringing the system down, or limiting WU production or allocation, mean that the caches just haven't had a chance to re-fill.
Many of the faster systems would be having the same problem - the cache just isn't filling up because something occurs that stops it from getting work - but they're still processing work at a significant rate. So when Seti comes up again, the gains made in building up the cache since the last outage have been wiped out.

I suspect that if the system was able to remain up without any slowdowns in work production or allocation between weekly outages, then after 2 weeks the traffic would drop off considerably as all the faster machines would have finally been able to fill their caches.

Grant
Darwin NT

It seems I see a pattern of hit/miss efforts to fix problems that are never really pinned down.

Before the righteous all chime in here, let's seriously look at what's going on. There are a lot of good companies who would contribute hardware and maybe even communications hardware and bandwidth. It just seems the people within the project, while doing yeoman's work for little reward, are simply not getting the support, funding, and outside help this project deserves.

We seem to have but a few who even bother to communicate what's going down and that will shortly meet a two month hiatus as Matt hits the road for his music tour.

Where are the principals on this team, and why can't they manage to set up a regular weekly update, a regular status update, and a troubleshooting direction?

A lot of us out here are available for more than CPU cycles if we're communicated with. We might even be able to dissect the system if someone would take the time to block-chart a schematic and layout of the systems and the hardware.

Those of you who are offended by my viewpoints need not throw electrons, as frankly I am not a happy SETI at Home camper anymore. I'm aware of the limited manpower, limited funds, and so on. I also expect that this project be handled by people giving their best work.

A prime example is the dated information on the website and how it's not current at all.

What, for example, has come from the data acquired a few weeks back? Has it even been analysed at all? How about some feedback on the front page...

Sorry, but with my three PCs working on SETI and Einstein, I am getting a better feeling looking for gravitational waves than I am looking for evidence of communications between ETs.

Never engage stupid people at their level, they then have the home court advantage.....

I'll be on the road (all over eastern North America in September, all over Europe in October) playing keyboards/guitar with the band Secret Chiefs 3. It's been a crazy month thus far getting ready for that.

I don't know if it's even possible, but during the software blanking stage of cleaning the tapes up before splitting them, could APs that are 100% blanked be prevented from being sent out at all? That's 16MB of data transfer that can be saved for every WU affected by it. My machines are nowhere near power crunchers, but I still pick up a handful of those WUs back-to-back every now and then.

I guess something along the lines of a CSV file that says blanking starts at this offset and lasts for this many bytes, so when the splitters come along, they look at the CSV table and see what sections of the tape to skip entirely (obviously only sections where a whole WU will be 100% blanked).
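As a rough sketch of that suggestion (the CSV layout, function names, and range semantics here are my own assumptions, not anything the splitter actually does), the lookup could be as simple as:

```python
import csv
import io

# Hypothetical sketch: a CSV table of 100%-blanked regions of the tape,
# one "start_offset,length" row per region (byte offsets).
def load_blanked_ranges(csv_text):
    """Parse the CSV into a list of (start, end) byte ranges."""
    return [(int(start), int(start) + int(length))
            for start, length in csv.reader(io.StringIO(csv_text))]

def fully_blanked(wu_start, wu_len, ranges):
    """True if the whole workunit [wu_start, wu_start + wu_len) falls
    inside a single blanked region - i.e. the splitter should skip it."""
    wu_end = wu_start + wu_len
    return any(start <= wu_start and wu_end <= end for start, end in ranges)
```

A splitter consulting a table like this before cutting each workunit would only skip sections where the entire WU is 100% blanked, exactly as suggested above - partially blanked WUs would still go out.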

Just a thought.

Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)

It has a Phenom II X6 1055T CPU and three Cayman GPUs; most of its MB CPU tasks are inconclusive/invalid, as are its ATI Astropulse GPU tasks.
It has 765 AP tasks in its list: 24 of those are valid, 60 invalid, and 673 pending, most of which only took a few hundred seconds if that, and it still has a Max tasks per day of 100 for ATI AP.
The counts for CPU Multibeam are hardly any better, and that also has a Max tasks per day of 100.

Hello to all. Since the project is down at the moment, and I am out of work, I think I'll just back out of the program for a short time. Got a lot of things going on here at the moment, so I'm shutting down for a while. All work units that I had are now "CRUNCHED" and sent back in. I'll be back though! Thanks!

I have been crunching with SETI@home on and off for a while now. Whenever there were major issues that brought the project down, these guys (Matt et al.) always pulled through to get the project up and running again, whether it be sooner or later. I have faith that they will pull through this problem as well.

It's a shame there are not more donations, both financially and in terms of hardware. So sad to see a good program go underfunded. If I ever run into a lot of money, you can bet I would make a large donation to the project.

"By faith we understand that the universe was formed at God's command, so that what is seen was not made out of what was visible." Hebrews 11:3