In case anyone has missed it, you can now download compressed headers from our servers using Newsbin.
You should see a substantial speed increase, anywhere from 5x to 15x.

Compressed header downloading is available in Newsbin version 5.50 B7 and above; it may also be present in B6.
The option can be activated in the server options screen. Make sure the "Header Compression" box is checked.
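
To give a feel for where the 5x-15x comes from: overview data is highly repetitive text, so DEFLATE-style compression shrinks it dramatically. Here's a quick self-contained Python sketch of the effect (the sample lines are synthetic, not real server output):

    # Why compressed headers are faster: overview lines are repetitive
    # text, so DEFLATE compresses them heavily. Sample data is synthetic.
    import zlib

    lines = "\r\n".join(
        f"{n}\tSample post part {n:03d}\tposter@example.com\t"
        f"Mon, 01 Jan 2024 00:00:00 GMT\t<{n}@example.com>\t\t512000\t4000"
        for n in range(1, 2001)
    ).encode("ascii")

    packed = zlib.compress(lines, level=6)
    print(f"raw {len(lines)} bytes -> compressed {len(packed)} bytes "
          f"({len(lines) / len(packed):.1f}x smaller)")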

..so I ended up deleting the a.b.m data folder and trying to 'download all'. It starts off with 49 million headers to download; after 15 million headers have downloaded, it fails and restarts with 33 million headers to go, then promptly gets stuck repeating the same "No data after decompression" message.

If I cancel the failing/looping 'download all' with 33 million to go and do a 'download latest', it starts again from 49 million.

If you don't run the most current betas, you should. I assume you're running B7, which was the first version with compression. Steve brought this to my attention and I fixed it in B8. Pretty sure B8 has the fix anyway; my todo list doesn't track versions, just days.

Quade wrote:If you don't run the most current betas, you should. I assume you're running B7, which was the first version with compression. Steve brought this to my attention and I fixed it in B8. Pretty sure B8 has the fix anyway; my todo list doesn't track versions, just days.

stevef, any chance you could fix header compression across all your server farms and give it traffic priority, seeing as it's only 1 connection, so users can actually download all the headers of a large top group without it timing out or dropping to low speeds for days?

There is no artificial priority difference between header and body downloads.

Any slow header download in the last 5 days can be attributed to a header server problem; this is no longer an issue. If you're having slow header downloads at this very instant, and it is reproducible, then send helpdesk a ticket for testing.

How many connections are you running? Lots of people are running too many connections and causing issues. Just for grins, I'd run it back to 10 connections and try again. I get crazy fast speeds to AW when everything is working.

Quade wrote:How many connections are you running? Lots of people are running too many connections and causing issues. Just for grins, I'd run it back to 10 connections and try again. I get crazy fast speeds to AW when everything is working.

So as users we can download binaries at 120 Mbps, but it's perfectly acceptable to download headers at a fake 40 Mbps compressed speed for hours and days, get to about 10%, and then have it time out without completing?

You are comparing article downloads using multiple connections, whereas header download uses a single connection per group. With article downloads there is no server processing; with header downloads there is heavy parsing of each article to give you that single overview line.
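
For reference, each overview line is a single tab-separated record per article (the standard OVER/XOVER layout from RFC 3977). A minimal Python sketch of splitting one such line, with invented sample values:

    # Each overview line is one tab-separated summary per article
    # (OVER/XOVER layout per RFC 3977). Sample values are invented.
    def parse_overview_line(line: str) -> dict:
        keys = ("number", "subject", "from", "date",
                "message_id", "references", "bytes", "lines")
        record = dict(zip(keys, line.rstrip("\r\n").split("\t")))
        record["number"] = int(record["number"])
        record["bytes"] = int(record["bytes"])
        return record

    sample = ("123456\tBig post [01/50]\tposter@example.com\t"
              "Mon, 01 Jan 2024 00:00:00 GMT\t<abc@example.com>\t\t512000\t4000")
    print(parse_overview_line(sample)["subject"])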

I always update multiple groups at a time, so all my connections are used for header downloads. I've seen speeds close to 1 Gbps to AW; more typical is 500-600 Mbps. "Per connection" speed of 40 Mbps isn't really that bad. Giganews will barely do that; typically they're slower "per connection" than AW.

Newsbin has to work much harder downloading headers than downloading files. I wouldn't be surprised if Newsbin's own processing plays a role in that 40 Mbps per-connection speed; if nothing else, it has to decompress a multi-megabit-per-second data stream. When you use multiple connections, you get multiple cores doing the decompression.
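
A rough sketch of that multi-connection effect, assuming each connection delivers its own compressed batch. CPython's zlib releases the GIL during (de)compression, so even plain threads get real parallelism here (the batches below are synthetic):

    # Each connection's compressed stream can be inflated on its own core;
    # CPython's zlib releases the GIL while (de)compressing, so a thread
    # pool parallelizes the work. Batches below are synthetic.
    import zlib
    from concurrent.futures import ThreadPoolExecutor

    batches = [zlib.compress(bytes(1_000_000)) for _ in range(4)]

    with ThreadPoolExecutor(max_workers=4) as pool:
        inflated = list(pool.map(zlib.decompress, batches))

    print(sum(len(b) for b in inflated), "bytes after decompression")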

If you just let Newsbin update your groups automatically once an hour, you'd always have mostly current headers too.

You can't download all the headers from boneless without it timing out, so we are paying for a service we can't use, as we can't view the complete content.

Just restart the header download; Newsbin will continue where it left off. Just don't "download all headers" a second time, because then you're starting over again and won't make any progress. Even if you completely exit Newsbin, restart, and download headers again, it'll still download from where it left off until it gets all the headers.
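
In pseudocode terms, "continue where it left off" just means remembering a high-water mark per group and only asking for newer ranges. A hedged Python sketch of that idea using the stdlib nntplib (present through Python 3.12); the host, group name, and marker file are placeholders, not Newsbin's actual mechanism:

    # Resume logic: remember the highest article number already stored and
    # only request newer overview ranges. Host, group, and marker file are
    # placeholders; nntplib is in the stdlib through Python 3.12.
    import pathlib
    from nntplib import NNTP

    GROUP = "alt.binaries.misc"             # placeholder group
    MARKER = pathlib.Path("highwater.txt")  # last article number seen

    with NNTP("news.example.com") as server:  # placeholder host
        _, _, first, last, _ = server.group(GROUP)
        start = int(MARKER.read_text()) + 1 if MARKER.exists() else first
        if start <= last:
            _, overviews = server.over((start, last))
            # ... store the overview records to disk here ...
            MARKER.write_text(str(last))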

I can only guess you "download all headers" again and throw away the headers you've already downloaded.

The problem you're having might just be on your end: your PC or your network. With Newsbin, it's easy to recover from a connection dropping out or disconnecting; just start the header download going again.

I don't understand this crazy idea of compressing such a large database over and over again for each customer.

There should be a system that auto-prunes any headers older than 30 days from the server database and compresses them once into monthly header archives, which are then served to the customer as multi-part binary files so they can max out their connection like any other file.

Sounds like a good idea. Now, how will you serve me the data when I request a range that is a fraction of one multipart, or a request that begins in the middle of one multipart and ends in the middle of another? And remember, I don't want any extra data, because I've got a download cap.
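
A toy illustration of that objection, assuming fixed-size monthly archives (all numbers invented): any request that doesn't line up with archive boundaries forces the client to fetch and discard extra records.

    # Toy model: fixed monthly archives of ARCHIVE_SPAN articles each. Any
    # request not aligned to archive boundaries fetches extra records the
    # client must throw away. All numbers are invented.
    ARCHIVE_SPAN = 1_000_000

    def archives_for(start: int, end: int) -> tuple[list[int], int]:
        first_arc, last_arc = start // ARCHIVE_SPAN, end // ARCHIVE_SPAN
        fetched = (last_arc - first_arc + 1) * ARCHIVE_SPAN
        wasted = fetched - (end - start + 1)
        return list(range(first_arc, last_arc + 1)), wasted

    arcs, wasted = archives_for(1_500_000, 2_200_000)
    print(f"must fetch archives {arcs}; {wasted:,} extra records downloaded")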

Couldn't a system be coded that runs in parallel, so if someone needs a fraction of a multipart they get switched back over to the current way of serving the headers for that range until they are back in sync with the monthly compressed snapshots?

Or even a system where those who request all headers from a group send a new server command, so they are treated differently and served mostly from a compressed archive, to take the load off the server and give them max speed?

If you look at the pattern of requests, nearly ALL requests will be partial multiparts. And introducing new commands would require the co-operation of all USENET providers and NNTP readers. Just introducing compressed headers without breaking the existing NNTP specification was a headache in itself.

Could you do a virgin install of Newsbin on the Astraweb network, download all the headers of, say, the top 10 or 20 groups, and when it has finished, leave the SPOOL_V6 folder from Newsbin in the members area for us to download and copy into our own Newsbin folder? That folder is a compressed database of the headers, so all we would then need to do is update the headers from when the snapshot was made.

Most of the other groups are not a problem for getting all the headers, so it would only need to be done for the top ones like boneless.

Cocha wrote:Could you do a virgin install of Newsbin on the Astraweb network, download all the headers of, say, the top 10 or 20 groups, and...

No, that's not going to be workable. Every time there's an update, you would all be downloading the entire SPOOL_V6 folder again. Plus, the current infrastructure has you downloading across a load-balanced server farm; in your scenario, we would additionally have to make sure the web servers are load-balanced just to serve out the SPOOL_V6 folder.