Because I don't want to download rather large files like user_id.gz (23-Jun-2003, 20:18: 154 MB!) just to create a few statistics for my little team.
I think many small teams can't afford daily downloads and parsing on that scale the way the bigger ones (e.g. boinc.dk) can. They would rather download only the segments that hold the information they actually need about their own team and users.

I agree with you (and I'm from one of those larger teams - Overclockers UK). I understand that the old system couldn't handle the number of users/teams in the public project (that was evident from the 19,000+ stats files), but this has gone from one extreme to the other.

If lots of teams start downloading 150MB+ every day to update their stats, that can't be good for Berkeley's servers. That isn't going to happen though because most teams don't have the resources to download and process 150MB (compressed) worth of XML stats every day. That's well into dedicated server territory (there's no way the average hosting account could handle that).

I understand that this wasn't a particularly high priority in getting the project released, so I'm just expressing my opinion that the solution we have now isn't workable. I hope it's just a temporary 'solution' while the project gets on its feet.

I'll look at a dedicated server if I must, but I'd rather not - at least for now.

If they could split them into pieces of ~2-3 MB, that would make this problem a bit easier. But if the number of records runs into the thousands or tens of thousands, it would be better to spread those records among several servers/mirrors. (First thousand here, this thousand there, that thousand... hmm... ah, over there... - and so on)
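To make the "pieces of ~2-3 MB" idea concrete, here is a minimal sketch of how an export could be cut up by size, with a small index so a client knows which piece holds which IDs. Everything in it is an assumption for illustration: the function names, the 2.5 MB default, and the idea that records arrive as (id, bytes) pairs sorted by ID. It is not how the project actually builds its files.

```python
def split_by_size(records, max_bytes=2_500_000):
    """Group (record_id, payload_bytes) pairs into chunks of at most max_bytes.

    A single record larger than max_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for rec_id, payload in records:
        # Start a new chunk when adding this record would overshoot the limit.
        if current and size + len(payload) > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append((rec_id, payload))
        size += len(payload)
    if current:
        chunks.append(current)
    return chunks


def build_index(chunks):
    """Return one (first_id, last_id, chunk_number) tuple per chunk."""
    return [(chunk[0][0], chunk[-1][0], n) for n, chunk in enumerate(chunks)]
```

A small team would then fetch only the index plus the one or two pieces covering its own IDs, and the numbered pieces could just as well sit on different mirrors.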

Shoot, I download 7 team stats once a day, and even then only at non-peak hours.

Those 7 team stats are mine plus 3 on either side of me (to watch the competition).
Now, if those 7 teams are not within x number of IDs of each other, I'm looking at 7-14 MB alone. And that's not counting the fact that there is no team_standings list I can reference to even know which teams are around me, so I would have to download ALL the files just to find that out.
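For what it's worth, if a small standings list did exist, picking out "me plus 3 on either side" would be trivial. A rough sketch, assuming a hypothetical list of (team_id, total_credit) pairs (no such file is published, which is the whole complaint):

```python
def neighbours(standings, my_team, span=3):
    """Return the teams ranked within `span` places of my_team, inclusive."""
    # Rank by credit, highest first.
    ranked = sorted(standings, key=lambda t: t[1], reverse=True)
    # Raises StopIteration if my_team is not in the list.
    pos = next(i for i, (team_id, _credit) in enumerate(ranked)
               if team_id == my_team)
    return ranked[max(0, pos - span): pos + span + 1]
```

With that in hand, a team would know exactly which 7 team files to fetch instead of pulling everything.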

And I won't even get INTO all the seti-banners out there that simply show individual user stats. PHP won't parse a 158 MB file to grab 3 lines of stats.
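On the parsing side, the memory problem can at least be softened by streaming the file instead of loading it whole. Below is a minimal sketch in Python, assuming the export is gzipped XML with one `<user>` element per record containing `<id>`, `<name>` and `<total_credit>` children. Those tag names and the `find_users` function are assumptions for illustration, not the confirmed layout of user_id.gz. It does nothing about the download size, which is still the real problem.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def find_users(source, wanted_ids):
    """Stream a gzipped XML export and return only the wanted user records."""
    wanted = set(wanted_ids)
    found = {}
    with gzip.open(source, "rb") as fh:
        context = ET.iterparse(fh, events=("start", "end"))
        _event, root = next(context)  # first event is the start of the root
        for event, elem in context:
            if event != "end" or elem.tag != "user":  # assumed record tag
                continue
            uid = elem.findtext("id")
            if uid in wanted:
                found[uid] = {child.tag: child.text for child in elem}
                if len(found) == len(wanted):
                    break  # stop reading once every wanted record is in hand
            root.clear()  # discard records already seen so memory stays small
    return found


# Tiny in-memory stand-in for the real file, using the assumed layout.
SAMPLE = gzip.compress(
    b"<users>"
    b"<user><id>1</id><name>a</name><total_credit>10.5</total_credit></user>"
    b"<user><id>2</id><name>b</name><total_credit>3</total_credit></user>"
    b"<user><id>3</id><name>c</name><total_credit>7</total_credit></user>"
    b"</users>"
)
result = find_users(io.BytesIO(SAMPLE), ["2"])
```

`source` can be a file name or an open file object, so the same function would work on a downloaded file on disk.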

Something has to be done or all those small stats/news/etc pages out there, which REALLY encourage their users, and make the whole project fun, are going to die out.

I think they still need stats divided into individual teams, perhaps with every 100 teams in a separate directory, plus the large files for those LARGE sites that need all the data?
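As a sketch of what "every 100 teams in a separate directory" could look like, here is one way to map a team ID to a path. The root directory, the zero-padded range naming and the file name are all made up for illustration; the only part taken from the suggestion above is the bucket size of 100.

```python
from pathlib import PurePosixPath


def team_stats_path(team_id, per_dir=100, root="stats/teams"):
    """Map a team ID to a per-team file inside a directory of `per_dir` teams."""
    bucket = (team_id // per_dir) * per_dir  # first ID in this directory
    dirname = f"{bucket:06d}-{bucket + per_dir - 1:06d}"
    return PurePosixPath(root) / dirname / f"team_{team_id}.xml.gz"
```

Because the path is computed from the ID alone, a stats page can go straight to its own team's file without downloading a listing first.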

All I know right now is that the current system is NOT going to work.

(Sorry. I've written the setinews for my team for almost 2 years, and I don't have a clue where to start with the stuff I'm seeing now. I also know how many members of my team crunch simply because of the news, or so they tell me.)