http://www.theregister.co.uk/week.html
I would like to have this indexed by the dates on the page, so that the table of contents lists the dates with the articles as sub-titles (much like the one for the Calgary Herald). It would also be great if it included the links that go to reghardware.com. This is definitely beyond my script-fu.

Both of these would be greatly appreciated and would save me (literally) days of futzing around trying to learn Python.

append_page recursively looks for the next-page tag ('div', attrs={'class':'toolbar_fat_next'}), fetches the text of the page it points to, and inserts that text into the soup at the point where the tag was found. It repeats this until all pages have been inserted.

preprocess_html uses append_page to modify the HTML. You'll need to find the next-page tag on your site and adjust the tag name and class accordingly. This should get you started.
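
Here is a minimal sketch of the recursion described above, using only the standard library and plain strings rather than calibre's soup object. The PAGES dictionary, the fetch helper, the URLs, and the NEXT_PAGE pattern are all hypothetical stand-ins for illustration; in a real recipe you would search the soup that preprocess_html receives and fetch the next page over the network.

```python
import re

# Hypothetical stand-in for the network: maps a URL to that page's HTML.
PAGES = {
    "/story?page=1": '<p>Part one.</p>'
                     '<div class="toolbar_fat_next"><a href="/story?page=2">Next</a></div>',
    "/story?page=2": '<p>Part two.</p>'
                     '<div class="toolbar_fat_next"><a href="/story?page=3">Next</a></div>',
    "/story?page=3": '<p>Part three.</p>',
}

# Assumed markup: the next-page div wraps a single link to the next page.
NEXT_PAGE = re.compile(
    r'<div class="toolbar_fat_next">\s*<a href="([^"]+)">.*?</a>\s*</div>',
    re.DOTALL,
)


def fetch(url):
    """Hypothetical fetcher; a real recipe would download the page instead."""
    return PAGES[url]


def append_page(html):
    """Replace the next-page tag with the next page's content, recursively."""
    match = NEXT_PAGE.search(html)
    if match is None:
        return html  # no next-page tag: this is the last page
    # Expand the next page first, so its own next-page tag is already resolved.
    next_html = append_page(fetch(match.group(1)))
    # Splice the expanded content in at the point where the tag was found.
    return html[:match.start()] + next_html + html[match.end():]


def preprocess_html(html):
    """Entry point: return the article with all of its pages merged into one."""
    return append_page(html)
```

Calling preprocess_html(PAGES["/story?page=1"]) returns the three paragraphs joined into a single document with the next-page divs removed. Only the pattern (or, with a soup, the tag name and class) should need changing for a different site.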

The Register's week.html page appears to reproduce the RSS feed (at least for the first 3 entries I checked). Calibre has a built-in recipe for The Register's RSS feed. Why don't you look at that one first to see if it meets your needs?

Does anyone happen to have a good working recipe for fark.com? I love reading the bizarre stories they post there. Thanks.

Fark has an RSS feed, and I looked at it. Each entry seems to have a one-sentence description of an article on another site and a slew of comments. Do you just want the one sentence from Fark with the link, or do you want the comments too? The content of the linked articles is probably too variable to add easily, as it comes from dozens of different sources, each with a different page structure. You'd get lots of junk with each one.