I'm trying to figure out a way to analyze code longevity in open source projects: that is, how long a specific line of code is active and in use.

My current thinking is that a line of code's lifespan begins when it is first committed, and ends when one of the following occurs:

It is edited or deleted,

It is excluded from builds, or

No code within its build is maintained for some period of time (say, a year).

NOTE: To clarify why an "edit" counts as a "death": an edited line would be counted as a "new" generation, that is, a new line of code. Also, unless there's an easy way to do it, I would not account for the longevity of a lineage, or descent from an ancestor.
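To make the definition above concrete, here is a minimal sketch of the "edited or deleted" rule only. It assumes the history of a single file is already available as a list of (timestamp, lines) snapshots, oldest first. The function name `line_lifespans`, the record layout, and the integer timestamps are all hypothetical choices for illustration. It uses Python's standard `difflib` to decide which lines survive between snapshots, so its notion of an "edit" is whatever that diff reports. The build-exclusion and "unmaintained for a year" rules are not modeled.

```python
from difflib import SequenceMatcher


def line_lifespans(revisions):
    """Track the birth and death of each line across file snapshots.

    `revisions` is a list of (timestamp, list_of_lines), oldest first.
    Returns a list of dicts with keys "text", "born" and "died";
    "died" is None for lines still present in the last snapshot.
    An edited line counts as one death plus one new birth.
    """
    records = []     # every line ever seen, in order of birth
    live = []        # live[i] = index into records for line i of prev_lines
    prev_lines = []

    for timestamp, lines in revisions:
        matcher = SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        new_live = [None] * len(lines)

        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep their existing record.
                for offset in range(i2 - i1):
                    new_live[j1 + offset] = live[i1 + offset]
            else:
                # "replace" and "delete": the old lines die here.
                for i in range(i1, i2):
                    records[live[i]]["died"] = timestamp
                # "replace" and "insert": the new lines are born here.
                for j in range(j1, j2):
                    records.append(
                        {"text": lines[j], "born": timestamp, "died": None}
                    )
                    new_live[j] = len(records) - 1

        prev_lines, live = lines, new_live

    return records


# Hypothetical history: timestamps are days since the first commit.
example_revisions = [
    (0, ["a", "b", "c"]),
    (10, ["a", "B", "c"]),   # "b" is edited into "B"
    (30, ["a", "c", "d"]),   # "B" is deleted, "d" is added
]
example_records = line_lifespans(example_revisions)
```

A line's lifespan is then `died - born`, or the time since `born` for lines that are still alive. In a real project the snapshots would have to come from the version control history, and moved or re-indented lines would need a policy of their own.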

Again, thanks. So far the only useful information I've been able to find is "We learn that 61% of the lines of code in today’s OpenBSD are foundational: they were introduced prior to the release of the initial version we studied and have not been altered since." That is interesting, but not really related. Everything else appears to focus on how long it takes for vulnerabilities to be fixed, which is again interesting, but says nothing about which factors to account for in code lifespan. Is there something I'm missing?
– blunders Jun 8 '11 at 21:55

+1 @Steven A. Lowe: Yes, I'd thought about that, though how would you know if the code was being executed? Clearly, if it's not being executed within a build, it's dead. Do you mean it's not within the control flow's execution path? If you mean, for example, how Windows XP is not in active development but is still in active use, I'm not sure how you'd know whether the code was in active use or not. Maybe downloads, though those still aren't real executions. Thanks for the input!
– blunders Jun 8 '11 at 17:12

The question of whether a given line of code can be or will be executed is equivalent to the Halting Problem, so there's no general algorithmic solution, which means it can't possibly be automated.
– David Thornley Jun 8 '11 at 19:26

@David: modern compilers are quite good at this
– Steven A. Lowe Jun 8 '11 at 19:34

@Steven A. Lowe: If you mean that they can find a lot of the code that isn't executed, sure. There's no way to find all of it unless you're willing to allow code that is executed to be misclassified, which usually isn't the case.
– David Thornley Jun 8 '11 at 19:54

@David, @blunders: That is false. The .NET compiler will give you a warning for "Unreachable code detected". For example, if(false){foo();} will never execute foo().
– Morgan Herlocker Jun 8 '11 at 19:54
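
The two positions in the comments above are compatible: a tool can flag some code that is certainly unreachable without being able to decide every case. As a rough illustration, here is a small Python sketch (assuming Python 3.8+, and not how the .NET compiler works internally) that uses the standard `ast` module to find `if` statements whose condition is a literal falsy constant. The function name `constant_false_branches` is made up for this example.

```python
import ast


def constant_false_branches(source):
    """Return the line numbers of `if` statements whose test is a
    literal falsy constant (False, 0, None, ...).

    This is deliberately conservative: it only reports branches that
    can never run, and stays silent about conditions it cannot
    evaluate without executing the program.
    """
    tree = ast.parse(source)  # parses only; nothing is executed
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.If)
            and isinstance(node.test, ast.Constant)
            and not node.test.value
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = "if False:\n    foo()\nif flag():\n    bar()\n"
dead_lines = constant_false_branches(sample)  # only line 1 is reported
```

The `if False:` branch is reported, while `if flag():` is not, because whether `bar()` ever runs depends on what `flag()` returns at run time. That second kind of case is the one the Halting Problem argument is about.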