mnooning has asked for the
wisdom of the Perl Monks concerning the following question:

I need a way to get at the code of a subroutine. Not to execute it. Rather, to independently generate the checksum of the subroutine code. I thought it might be easy using a sub's code ref, but a coderef is only good for executing code, not for seeing the code itself. The end goal is to check each of the subs to tell whether any have been tampered with by a hacker, independently of an overall file checksum.

Why would a finer grained inspection of the subroutines be any more suspect than any other part of the code? I would think, once the current state of the file has been blessed, that an overall checksum of the file would be more than sufficient to show that ANY changes have been made when NONE were expected. Once you have a corrupted file identified then you can use something like diff to see what changed.

Otherwise, there happens to be this dynamic language called Perl that is very good at parsing text. ;) Anything from a simple regex to Parse::RecDescent can be used to extract the bits of text that make up a Perl subroutine.
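As a rough illustration of the "simple regex" end of that range, here is a minimal sketch that pulls named subs out of source text and checksums each one. The helper name sub_checksums is made up for this example, and the pattern is deliberately naive: it assumes braces are balanced and will be fooled by braces inside strings, comments, regexes, or heredocs. It also assumes the source text is plain bytes.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: map each named sub found in $source to the
# SHA-256 hex digest of its text ("sub name { ... }").
sub sub_checksums {
    my ($source) = @_;
    my %digest;
    # Group 1: the whole sub; group 2: its name; group 3: a brace-balanced
    # block, matched by recursing into group 3 for every nested { ... }.
    while ($source =~ /^\s*(sub\s+(\w+)\s*(\{(?:[^{}]++|(?3))*\}))/mg) {
        $digest{$2} = sha256_hex($1);
    }
    return \%digest;
}

# Usage: checksum two small subs held in a string.
my $code = <<'END';
sub add { my ($x, $y) = @_; return $x + $y }
sub pick {
    my ($h) = @_;
    return { %$h };
}
END
my $sums = sub_checksums($code);
print "$_ $sums->{$_}\n" for sort keys %$sums;
```

Anything more robust than this (anonymous subs, prototypes, braces in quoted text) is where a real parser starts to earn its keep.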

A hacker can replace needed bits in a file, then add other bits so that the overall file checksum stays valid. Doing that gets a quantum leap harder if you have to hack both the subroutine checksums and the file checksum.

As for RecDescent, if I could get at the text of a sub I could parse the sub and checksum it myself. The trick is to get at the text of the subroutine. That is where the question lies. Parse::RecDescent needs something like "$text", where $text is the text of the subroutine. You cannot hand it just a coderef. :-)

Use Digest::SHA to calculate a digest over the whole file. Although it is not impossible to make two totally different files with the same digest, it is extremely unlikely that both files will have the same length and both will be working programs. It is not as simple as changing a few instructions and adding a few meaningless bytes at the end to "make up" the checksum.
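A minimal sketch of that whole-file digest, assuming SHA-256 is an acceptable choice (the post does not name a variant). The helper name file_digest is invented for the example.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper: SHA-256 hex digest of a file's raw bytes.
sub file_digest {
    my ($path) = @_;
    # Mode "b" makes addfile read the file in binary mode.
    return Digest::SHA->new(256)->addfile($path, "b")->hexdigest;
}

# Usage: print one digest per file named on the command line.
print file_digest($_), "  $_\n" for @ARGV;
```

Record the output while the file is known to be good, then compare against it later.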

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

"A hacker can replace needed bits in a file, then add other bits so that the overall file checksum stays valid."

True, but it's very, very hard. And would be made exponentially harder -- effectively impossible -- by taking two checksums of the file using different algorithms. For example taking both a SHA-512 and Whirlpool hash of the file, then concatenating them.
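A sketch of the concatenation idea, with one substitution stated up front: Whirlpool is not in the Perl core, so this example pairs SHA-512 with the core Digest::MD5 purely to stay self-contained. MD5 is a weak hash on its own and is only a stand-in here; for the combination described above, swap in a Whirlpool implementation from CPAN such as Digest::Whirlpool. The helper name combined_digest is hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA;
use Digest::MD5;

# Hypothetical helper: concatenate two digests of the same file,
# each computed with a different algorithm.
sub combined_digest {
    my ($path) = @_;

    # First digest: SHA-512 over the raw bytes of the file.
    my $sha = Digest::SHA->new(512)->addfile($path, "b")->hexdigest;

    # Second digest: MD5 as a stand-in for Whirlpool (see the note above).
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;

    return $sha . $md5;
}

# Usage: print the combined digest of each file named on the command line.
print combined_digest($_), "  $_\n" for @ARGV;
```

An attacker would then need a single modified file that collides under both algorithms at once.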