Suggestions from Benno Stein: as a final approach, use screenshots + OCR, because the displayed page and its source code will diverge even more strongly in the future. Extension: render only parts of the page (e.g. without pictures).
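
A rough sketch of how the screenshot + OCR idea, including the "without pictures" extension, might look in Python. The choice of Playwright for rendering and pytesseract (Tesseract) for OCR is an assumption made for illustration and not part of the suggestion; the function names and the set of blocked resource types are hypothetical as well.

```python
import io

# Assumption: the resource types to skip when rendering "without pictures".
BLOCKED_RESOURCE_TYPES = frozenset({"image", "media"})


def should_block(resource_type, blocked=BLOCKED_RESOURCE_TYPES):
    """Return True if a request of this resource type should not be loaded."""
    return resource_type in blocked


def screenshot_page(url, blocked=BLOCKED_RESOURCE_TYPES):
    """Render the page in a headless browser and return a full-page PNG as bytes."""
    # Imported here so the module loads even if Playwright is not installed.
    from playwright.sync_api import sync_playwright

    def handle(route):
        # Abort requests for blocked resource types, let everything else through.
        if should_block(route.request.resource_type, blocked):
            route.abort()
        else:
            route.continue_()

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.route("**/*", handle)
            page.goto(url)
            return page.screenshot(full_page=True)
        finally:
            browser.close()


def ocr_png(png_bytes):
    """Run OCR on a PNG image given as bytes and return the recognized text."""
    # Requires the pytesseract and Pillow packages plus a Tesseract installation.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(io.BytesIO(png_bytes)))


def page_text(url):
    """Text of the page as displayed, independent of its source code."""
    return ocr_png(screenshot_page(url))
```

Blocking requests by resource type is only one way to render parts of a page; hiding elements with injected CSS or cropping the screenshot would be alternatives.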