One of the big problems with Tor is that you have to trust your exit node when the data cannot be encrypted along the complete path (i.e. a website with no SSL), and even when the communication is authenticated and encrypted, you can still only have a medium level of confidence in your exit node.

I was thinking of a method to detect malicious exit nodes and came up with this simple scheme:

Establish a fully trusted exit node (i.e. you install it yourself, in a secure location, with signed software, and so on; the usual level of paranoia) and a server configured to serve a single long pseudorandom string (seeded from something you strictly control, like your UA), the SHA512 and Whirlpool hashes of this string, and the SHA512 / Whirlpool hashes of the final produced HTML (bare bones, with just your plain-text strings), both as a public site and as a hidden service.
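A minimal sketch of generating that reference page, assuming the seed is a client-controlled value such as the User-Agent. Note that Whirlpool is not among Python's guaranteed hashlib algorithms (it would need a third-party library such as pycryptodome), so only SHA512 is shown; the HMAC-based expansion is just one illustrative way to stretch a seed into a long pseudorandom string.

```python
import hashlib
import hmac

def make_reference_page(seed: bytes, length: int = 4096) -> dict:
    """Expand a client-controlled seed into a long pseudorandom string
    and publish it alongside its SHA512 digests (Whirlpool omitted:
    it would require a third-party library)."""
    blocks = []
    counter = 0
    # HMAC-SHA512 in counter mode: deterministic for a given seed,
    # so the client can recompute the expected page offline.
    while sum(len(b) for b in blocks) < length:
        blocks.append(hmac.new(seed, counter.to_bytes(8, "big"),
                               hashlib.sha512).digest())
        counter += 1
    challenge = b"".join(blocks)[:length]
    # Bare-bones HTML wrapper, as described: just the plain-text string.
    html = b"<html><body><pre>" + challenge.hex().encode() + b"</pre></body></html>"
    return {
        "challenge": challenge,
        "challenge_sha512": hashlib.sha512(challenge).hexdigest(),
        "html": html,
        "html_sha512": hashlib.sha512(html).hexdigest(),
    }
```

Because the expansion is deterministic, the client never has to trust the server's published hashes over the network: it can recompute everything locally from the seed.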

Then, by forcing circuits, you fetch this page sequentially through: your trusted node (for comparison), the first node to check, your trusted node again, the second node to check, etc., until you have passed through every exit node in the network, checking the whole communication for strict identity apart from the known variables at the packet level. This should ensure the communication is not tampered with along the path. That covers active attacks (JS injection, Flash, tracking pixels or whatever, or payload mangling). I haven't thought of a way to detect passive listening on the network, and frankly I don't think it is feasible.

As Tor doesn't use a purely random algorithm for circuit generation, you may have to force it to iterate through all the possible circuits.

If you're worried about injection and the exit node was malicious, I doubt that would help, as they would probably only inject on defined criteria (e.g. switch out a login page, or only track when you go to site X). So your check would go through fine, but the next time you logged onto sla.ckers the exit node might replace our login page when it sees you requested it.

Yeah, but I designed this detection model more with massive injection in mind (inserting malware, or whatever you want to do on an indiscriminate large scale) than targeted attacks.
But we could adapt the model as follows:

You browse normally (or a spider browses normally), with your proxy mirroring your requests through your known-good path.

As dynamic sites are just that, dynamic, chances are that they will "look" the same (i.e. content changes, but in predetermined areas) and will not suffer major changes. The known-good path would then serve as a reference sample, while the live traffic would be fuzzily compared against this known-good sample, looking for isomorphous patterns. If the "fingerprint" of the site is under a "change threshold" we let it through; otherwise we:
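One simple way to score that "change threshold", sketched with stdlib sequence similarity rather than a real structural/isomorphism comparison; the 0.9 threshold is an arbitrary example that would need per-site tuning for dynamic content.

```python
import difflib

def page_suspicious(known_good: str, observed: str,
                    threshold: float = 0.9) -> bool:
    """Fuzzy-compare a page fetched through the exit under test against
    the known-good sample; flag it when similarity drops below the
    threshold. Threshold value is an illustrative assumption."""
    similarity = difflib.SequenceMatcher(None, known_good, observed).ratio()
    return similarity < threshold
```

A real implementation would probably compare DOM structure rather than raw text, so that a dynamic site rotating content inside fixed areas scores as similar while an injected script block or iframe does not.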

1: drop the request
2: copy the whole packet-level session for static analysis
3a: mark the exit node as suspect in a blacklist ranking system
3b: if suspicious traffic is not an option, blacklist it for good
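Steps 3a/3b could be sketched as a strike counter, so that a single false positive from a dynamic page doesn't ban a node outright; the strike limit of 3 is an assumed knob, not part of the original scheme.

```python
from collections import Counter

class ExitBlacklist:
    """Hypothetical blacklist ranking: accumulate strikes per exit node
    (step 3a) and ban only after repeated suspicious sessions, with a
    hard-ban path for when suspicious traffic is not an option (3b)."""

    def __init__(self, strike_limit: int = 3):
        self.strikes = Counter()
        self.banned = set()
        self.strike_limit = strike_limit

    def report(self, exit_fp: str, hard: bool = False) -> None:
        # hard=True corresponds to "blacklist it for good" (step 3b).
        self.strikes[exit_fp] += 1
        if hard or self.strikes[exit_fp] >= self.strike_limit:
            self.banned.add(exit_fp)

    def is_banned(self, exit_fp: str) -> bool:
        return exit_fp in self.banned
```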