ivory has asked for the
wisdom of the Perl Monks concerning the following question:

Hey all,

I have been given several huge modules written by a coworker that have no documentation, and I have been asked to familiarize myself with them and provide some sort of documentation. I want to start by generating a list of 1. all the functions in each module, 2. the variable names used, and 3. the other functions each one calls (plus the input parameters). So far I have a script that does the first two, but the third is proving to be much more challenging. Once I have this, I will be able to finish the rest of the documentation by hand.

This may seem like a silly thing to want to do, but the group of modules is over 300 pages long! If any of you could point me in the right direction I would greatly appreciate it!

I actually ran into a similar need a little while ago, when I joined a project that had similar problems, only in PHP. Here is the program I wrote, adapted (poorly) for Perl files. It's pretty apparent you'll never be able to accurately find all variables and subs, due to Perl's free and easy syntax. The story follows below.
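For anyone reading without the linked program, here is a minimal sketch of the general idea. It is not the program mentioned above, just a rough regex scan written for illustration. The name `scan_source` and the demo subs are made up. It assumes every sub starts with `sub NAME {` at the beginning of a line and ends with a `}` in column 0, and it only notices calls written with parentheses to subs defined in the same source. Strings, heredocs, POD, nested parentheses in arguments and a modulus like `%3` will all fool it.

```perl
use strict;
use warnings;

# Rough, regex-based scan of Perl source text. Returns a hash ref:
#   { sub_name => { vars => [...], calls => [...] } }
# Assumption: subs look like "sub NAME {" ... "}" with the closing brace
# in column 0. This is a heuristic, not a parser.
sub scan_source {
    my ($source) = @_;

    # Pass 1: collect the body of every named sub.
    my %body;
    while ($source =~ /^sub\s+(\w+)\s*\{(.*?)^\}/msg) {
        $body{$1} = $2;
    }

    # Pass 2: for each sub, list its variables and its calls to known subs.
    my %report;
    for my $name (keys %body) {
        my (%vars, %calls);

        # Anything that looks like $name, @name or %name.
        $vars{$1}++ while $body{$name} =~ /([\$\@%]\w+)/g;

        for my $other (keys %body) {
            # A known sub name followed by "(", kept with its raw argument text.
            while ($body{$name} =~ /(?<![\$\@%\w])&?\Q$other\E\s*\(([^)]*)\)/g) {
                $calls{"$other($1)"}++;
            }
        }

        $report{$name} = {
            vars  => [ sort keys %vars ],
            calls => [ sort keys %calls ],
        };
    }
    return \%report;
}

# Made-up demo input.
my $demo = <<'END';
sub load_items {
    my ($file) = @_;
    my @items = parse_file($file, 1);
    return @items;
}

sub parse_file {
    my ($file, $strict) = @_;
    return ();
}
END

my $report = scan_source($demo);
for my $name (sort keys %$report) {
    print "$name\n";
    print "  vars:  @{ $report->{$name}{vars} }\n";
    print "  calls: @{ $report->{$name}{calls} }\n";
}
```

If you need something trustworthy rather than approximate, a real parser such as the CPAN module PPI is probably a better starting point than regexes.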

I joined this project that my friends (bad move number one) were working on. When they sat down to show me the code the familiar 'cold feeling of dread' swept over me. There were no comments.

"But you must have some comments in the code!" I exclaimed in a horrified voice.

"Oh yes, we do. Look, just before each function declaration, there's a comment" (twitch).

"And what about the variables?"

"Oh, they're self explanatory" (whimper).

I returned the next day armed with the program I linked to above, and ran it over their code. It revealed, among other horrors, that they were declaring the same function in different scripts (which worked together to run the website).

"Guys, why do you declare the same function in different parts of the program?"

"Well, when we were doing the new parts, we just cut and pasted the old functions, 'cause it was easier than worrying about making function libraries" (blackout) "but it's OK, when we change one we go through and change the rest" (suppressed scream) "How did you find out about the functions so fast?".

"Oh, I wrote this program to go through your source and pull out all the functions and variables"

"Oh excellent! Now we don't need a code map, we can just use your program!" ---

Sorry to put you all through that, I just had to get that off my chest. It's actually the abridged conversation, since he spent half an hour trying to convince me that you didn't need documentation if you used descriptive names for the functions and variables, while I got more and more 'worked up'. This rant probably gives the wrong impression too, because he is my friend and I do like him, but I could find no way to convince him that documentation might be a good idea. I eventually volunteered for webmaster and spent the next three months updating the library's closing hours.

I suppose the real joke is that that was his first job after graduating as a COMPUTER SCIENCE MAJOR.

____________________
Jeremy
I didn't believe in evil until I dated it.

Actually most of what you are putting them down for is
actually good advice. I likewise use few comments, use
descriptive variable names, comment each function, etc.
If you are relying on understanding comments rather than
figuring out the code, you will have serious problems
when the two disagree. And they will come to disagree
over time - a general principle is that whenever two
things need to stay in sync (eg comments and code, code in
two places, etc) they inevitably will tend to not do so
perfectly.

The real problem sounds like the use of global
variables and a cut-and-paste methodology...

For the record the single most miserable piece of code that
I have had the misfortune to meet was also one of the
most straightforward and heavily commented. It was utterly
impenetrable for the same reason that legalese is so hard
to read.

I agree that all those things are good practise, but I feel there is more to it. I'd also want a code map along the lines of "Well, we're going to have a main routine which fires up different function libraries depending on the CGI params, each case will be handled like this:....". You know, a general layout including future directions for the coders to go in. You need something like that when you have three people working on the same code.

And yes, their main problem was that they didn't understand OO in the slightest, and were using global variables rather than presenting methods/data in a useful, consistent way. I suspect if they had sat down and written out what they were trying to do, they would have started to spot ways to improve their code. Perhaps that's program design versus programming. The code had apparently grown out of a 'quick hack' that someone put together one day, so it was kinda lumpy and directionless.

In my mind good documentation is an art as much as coding or writing is. It isn't necessarily explaining what every line does. Having said that, I'm off to edit my code snippet before somebody reads all my posts and plays spot the hypocritical bastard :)

____________________
Jeremy
I didn't believe in evil until I dated it.

Actually most of what you are putting them down for is actually good advice. I likewise use few comments, use descriptive variable names, comment each function, etc. If you are relying on understanding comments rather than figuring out the code, you will have serious problems when the two disagree.

This is only true when you fall into the common trap of believing that comments are about explaining what you're doing in the code.

Comments shouldn't be doing that. The code should be easy enough to follow that it's pretty clear what it's doing.

Comments should be explaining WHY you're doing
what you're doing. That's the bit that's hardest for someone to reconstruct after the event:

"Yes, I can see that this loop is stepping through each item, building a hash of hashes of their info keyed by concatenating the title and the ID ... but WHY?"

Once you realise it's so they can easily sort by title at the end, but that keying the hash on title wasn't enough as not every product had a unique title, then this makes sense. A well written comment would have made this process a lot easier, however.
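To make that concrete, here is a small sketch of what such a comment might look like. The product records, field names and the "\0" separator are all invented for illustration and are not taken from any real code.

```perl
use strict;
use warnings;

# Hypothetical product records; the field names are assumptions.
my @products = (
    { id => 17, title => 'Widget', price => 4.50 },
    { id => 3,  title => 'Gadget', price => 9.00 },
    { id => 8,  title => 'Widget', price => 5.25 },
);

# WHY: the listing at the end is sorted by title, so the key starts with the
# title and a plain "sort keys" gives the right order. Titles are not unique
# (two products are called "Widget"), so the ID is appended to stop one
# product from overwriting another. The "\0" separator is there so a short
# title always sorts before a longer title that begins with it.
my %info;
for my $p (@products) {
    my $key = join "\0", $p->{title}, $p->{id};
    $info{$key} = { %$p };
}

for my $key (sort keys %info) {
    printf "%-8s %4d %6.2f\n", @{ $info{$key} }{qw(title id price)};
}
```

The loop itself says what is happening. Only the comment records why the key has that odd shape.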

Comments are there mainly for the benefit of anyone who has to maintain your code later (including yourself!). They should explain all the bits where people will go "So, why exactly are they doing this like this?"

The thing about this particular project is that the company I work for is worried that the guy who wrote it might quit...and then there is no one who knows anything about it at all :) Nice, right? But the author didn't think anyone would have to read the code, so there are no comments, and all the variable names are $a and $b. :)
