I know that bandwidth conservation may not be an issue for other BOINC projects, but it is for SETI. At least right now.

I wonder if.....

With the increasing number of fast hosts and faster GPUs, cache sizes are only going to increase, regardless of a user's settings.
Every time a host reports a completed WU or requests work, its entire cache list is sent along with the request. When cache sizes get into the thousands of WUs, this can be a sizable transmission.

Would it be possible or practical for Boinc to transmit only information relating to what has changed in the list since the last host update, and not the entire list every time? And perhaps the entire list only every 10 completed requests to be sure everything stays in synch?
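To make the idea concrete, here is a minimal sketch of what such delta reporting could look like on the client side. This is not actual BOINC code; the class and field names are invented, and the full-list interval of 10 follows the suggestion above.

```python
# Hypothetical sketch of delta task reporting: the client sends only the
# changes since its last report, plus a full list every N requests to
# keep client and server in sync. All names here are illustrative.

FULL_SYNC_INTERVAL = 10  # full list every 10 requests, per the proposal


class TaskReporter:
    def __init__(self):
        self.tasks = set()          # task names currently in the cache
        self.last_reported = set()  # what the server last saw
        self.requests_since_full = 0

    def build_report(self):
        self.requests_since_full += 1
        if self.requests_since_full >= FULL_SYNC_INTERVAL:
            # Periodic full list, to recover from any drift.
            self.requests_since_full = 0
            report = {"type": "full", "tasks": sorted(self.tasks)}
        else:
            # Only what changed since the last report.
            added = sorted(self.tasks - self.last_reported)
            removed = sorted(self.last_reported - self.tasks)
            report = {"type": "delta", "added": added, "removed": removed}
        self.last_reported = set(self.tasks)
        return report
```

With a cache of thousands of WUs but only a handful completed or received per request, the delta report stays tiny while the full list goes out only once in ten requests.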

Would this be cumbersome to implement?
Would the bandwidth saved be worth the effort?
Just asking for trouble?

Just kitty thoughts.

Happy is the person who shares their life with a cat. (Or two or three or........) =^.^=

I agree with this. Even when downloads/uploads are off, the feeder is empty, etc., there is still a decent amount of traffic just from people trying to get new work. Perhaps there could be a quick comparison between the host and the server of just the NUMBER of WUs on board, instead of sending the entire task list. Then, if there is a discrepancy (see: ghosts), the full list could be sent to see which tasks are missing. The server is already having to query the DB to compare task lists; surely it wouldn't be hard to just compare a number. I can't imagine it would add much, if any, extra load to the servers, and it would surely save some bandwidth.

Then, it would just be:

Requesting new tasks, have 4250 on board.
ACK, 4250 on board, DB agrees.
Here's more tasks.

Or:

Requesting new tasks, have 4250 on board.
Error: should be 4275 on board.
Sending entire task list.
etc.
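The exchange above can be sketched as a toy server-side check. This is only an illustration of the count-comparison idea, not real BOINC scheduler code; the function and field names are made up.

```python
# Toy model of the count-comparison proposal: the client sends only the
# number of tasks on board; the server compares it against its own count
# and asks for the full task list only on a mismatch (e.g. ghost tasks).
# All names are illustrative, not actual BOINC identifiers.

def handle_request(client_count, server_task_list):
    server_count = len(server_task_list)
    if client_count == server_count:
        # "ACK, DB agrees" -- hand out new tasks without the full list.
        return {"status": "ok", "action": "send_new_tasks"}
    # Discrepancy: fall back to the full comparison to find what's missing.
    return {
        "status": "mismatch",
        "expected": server_count,
        "action": "request_full_task_list",
    }
```

In the common case only a single integer crosses the wire instead of a list of thousands of task names; the expensive path runs only when the counts disagree.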

The server code is prepared to handle requests which don't have those lists of tasks on hand; some very old versions of the core client don't send them. So it's certainly possible to have a new core client which only sends the lists once in every 5 or 10 requests, etc.

Note that requests without the lists wouldn't be useful for the resend lost work capability, but maybe having only a fraction of requests triggering the comparison would reduce the database load it causes to a negligible amount.

However, another possibility to consider is having the request gzipped before sending. That would reduce the size considerably, and the method of sending a request is very similar to uploading a result (which already has an optional gzip feature). A 19236-byte request listing 37 in-progress tasks compresses to 3301 bytes, for instance.
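The roughly 6:1 ratio quoted above is plausible because scheduler requests are repetitive XML, which gzip handles very well. A small demonstration with a synthetic request body (the task names and tags below are invented, not the real request format):

```python
# Demonstrate why gzipping a scheduler request pays off: the body is
# highly repetitive XML, so gzip achieves a large reduction.
# The request below is synthetic; it only mimics the general shape.

import gzip

# Build a synthetic request listing 37 in-progress results.
parts = ["<scheduler_request>"]
for i in range(37):
    parts.append(
        f"<other_result><name>12ja11ab.12345.678.9.10.{i}</name>"
        "<app_name>setiathome_enhanced</app_name></other_result>"
    )
parts.append("</scheduler_request>")
request = "\n".join(parts).encode()

compressed = gzip.compress(request)
ratio = len(compressed) / len(request)  # well under half the size
```

The exact ratio depends on the request contents, but the near-identical per-task stanzas are exactly the kind of redundancy gzip was built for.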

Joe

The way it would have to work would be something like:
The client requests work and reports the count of tasks. It would also have to have a flag that indicates that it can do this.
The server would see the flag, and notice a discrepancy in the count.
The server would pick some tasks that were already allocated to the client to send (if done with a reasonable amount of intelligence, this MIGHT avoid the next round trip).
The server would add a flag to indicate that the client is missing some tasks.
The client upon receiving the tasks sets a flag for the project noting that on the next connection that it would have to report the list.
The client would inspect the tasks that the server has suggested it can work on, and pitch any that it already has. The client would then determine if it had to request more work.

If the server receives a list rather than just a count, it behaves pretty much as it does today. However, if the work request does not ask for enough tasks to use all of the tasks that are already allocated to the client, the flag to report the list of tasks would be sent to the client. (Yes, this could cycle several times before either the tasks are sent to someone else or are all delivered to the client.)

BOINC WIKI
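The steps above can be sketched as a pair of handlers. This is a rough illustration of the proposed flow with invented names, not the actual scheduler implementation: the client advertises the capability and a count, and on a mismatch the server resends tasks it believes the client holds and flags the client to report its full list next time.

```python
# Sketch of the count-plus-flag exchange described above.
# Field names and structure are hypothetical.

def server_handle(request, allocated_to_client):
    """Server side: compare the reported count with the DB's view."""
    reply = {"tasks": [], "report_full_list": False}
    if request.get("can_send_count_only") and "task_count" in request:
        if request["task_count"] != len(allocated_to_client):
            # Resend tasks already allocated to this client; with a bit
            # of intelligence this MIGHT avoid the next round trip.
            reply["tasks"] = list(allocated_to_client)
            reply["report_full_list"] = True
    return reply


def client_handle(reply, local_tasks):
    """Client side: pitch resent tasks it already has, note the flag."""
    new_tasks = [t for t in reply["tasks"] if t not in local_tasks]
    must_report_list = reply["report_full_list"]
    return new_tasks, must_report_list
```

As the post notes, this can cycle: if the resent tasks don't satisfy the work request, the client reports its full list on the next connection and the server reconciles from there.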

Why not make the workunits bigger?
Every server at Seti would benefit from it.

I'll try again.

Why not make the workunits larger in data size?
Every server at SETI would benefit from it.
Instead of files of about 300 KB per WU, increase them to, say, 1 MB.
That would cut the number of workunits to about a third.
The network load for WUs surely must drop.

That depends on the time required for the hosts to crunch the WU.
Simply increasing their size will not reduce server load if the crunch time per KB of data remains the same.
AP is a good example: the WUs are much larger, but the time required to process them does not increase in proportion to their size, so they are actually harder on bandwidth than MB work. Of course, VHAR MB work, with its very quick processing times, is even worse.

The only thing that would change the ratios is if the science application did more work on the data sent.
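The arithmetic behind this point is simple: a host's sustained bandwidth demand is roughly bytes per WU divided by crunch time per WU. The numbers below are illustrative, not actual SETI figures.

```python
# A host's average bandwidth demand is (data per WU) / (crunch time per WU).
# If crunch time scales with size, bigger WUs change nothing; only more
# computation per byte reduces the demand. Numbers are illustrative.

def bandwidth_kb_per_sec(wu_kbytes, crunch_seconds):
    return wu_kbytes / crunch_seconds

# A 300 KB WU that takes one hour to crunch:
small = bandwidth_kb_per_sec(300, 3600)

# A 1000 KB WU whose crunch time scales with size: same demand.
large_scaled = bandwidth_kb_per_sec(1000, 3600 * 1000 / 300)

# A 1000 KB WU where the app does 5x the work per byte: fivefold drop.
large_more_work = bandwidth_kb_per_sec(1000, 5 * 3600 * 1000 / 300)
```

This is why AP and VHAR behave as described above: what matters is crunch time per KB sent, not the WU size by itself.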

I am not sure how the new version of Seti being rolled out later this year compares in terms of processing time per WU sent.

I don't believe it's a bandwidth problem; rather, network requests are coming in to SETI too often because of the large number of WUs. The solution is to make fewer WUs by making them larger. The crunching time on the hosts has no relevance to this problem.

Meeeooowwwrrr.

If you look at the cricket graph, you'll see that bandwidth is, indeed, one of the many limiting factors: we've been hard against the top peg for 23 hours solid now.

You are also correct about the number of network requests being a strain on the servers, but you would need to solve both halves of the problem (and probably many more besides) to make a significant difference.

For over 12 years the data size has been the same, and after we have processed them the results are stored in the science database, with up to 30 items of interest for each result.

If the data size were increased, do we still look for a max of thirty items of interest and throw away any other items found,
OR do you have some magic way of storing these larger tasks (twice the size, 60 items, etc.) that would be compatible with all the older data?

(I noticed CPDN was also slow and took several retries to download these (big) WUs.)
I also lowered my cache from 4 days to 3 and 2 days, depending on the host.
When BOINC (6.12.33/34) asks for work and there is none, a new request is made a few minutes later; it takes some time before it has downloaded enough, or it waits over an hour to request again.

I have seen mention several times that they have considered processing MB work when AP tasks are downloaded, since there are about 20 MB tasks in the same data that generates 1 AP task. I think it is just a matter of working out the logic on the back end for that.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group today!

That number is about 30, IIRC. I would suggest that, since we already label tasks as VLAR, why not simply increase the size of just the VHAR WUs 3-4 fold? This would cut down the number of requests for work and keep the GPUs crunching away on a single WU instead of starting and stopping every couple of minutes.

In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

Is gigabitethernet2_3 only for SETI, or is it for Berkeley University and others?
To me it looks like they are sending 8 times more data than they are getting.
My contribution is getting 360 KB and sending 30 KB per WU, that is, I'm sending less data than I'm getting.

Like you said, it's indeed a bandwidth problem. It seems to me there are projects other than SETI that may have to look over their routines as well.

You have to remember that graph is inside-out....
The green is downloads going out.
The blue line is uploads and work requests going in.
