I'm working on a simple Ruby program that should count the lines of text in a Java file that contain actual Java code. A line gets counted even if it also has a comment in it, so basically only lines that are nothing but comments won't get counted.

I was thinking of using a regular expression to approach this problem. My program will just iterate line by line and compare it to a "regexp", like:

That makes sense. In that case, would it work if I had two separate regexes, one of which could check if a comment spanned multiple lines and read additional lines accordingly?
– gtorien Sep 10 '13 at 22:19

2 Answers

Getting the count for "Lines of code" can be a little subjective. Should auto-generated stuff like imports and package name really count? A person usually didn't write it. Does a line with just a closing curly brace count? There's not really any executing logic on that line.
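One way to keep those judgment calls explicit is to put them behind switches. The following is only a minimal sketch: the method name `countable?` and its keyword options are made up for illustration, and it assumes comments have already been removed from the line.

```ruby
# Hypothetical helper (not from the question): decides whether a line that
# is already free of comments should count as a line of code.
def countable?(line, skip_boilerplate: true, skip_lone_braces: true)
  text = line.strip
  return false if text.empty?
  # Optionally ignore import and package declarations.
  return false if skip_boilerplate && text.match?(/\A(import|package)\s/)
  # Optionally ignore lines made up only of curly braces (and a trailing ';').
  return false if skip_lone_braces && text.match?(/\A[{}]+;?\z/)
  true
end
```

With the defaults, `countable?("}")` is false and `countable?("int x = 1;")` is true; passing `skip_lone_braces: false` makes the lone brace count again.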

count = 0
file.each_line do |ln|
  # Manage multiline and single line comments.
  # Exclude single line if and only if there isn't code on that line
  next if ln =~ %r{^\s*(//|/\*[^*]*\*/$|$)} or (ln =~ %r{/\*} .. ln =~ %r{\*/})
  count += 1
end

The only problem is with lines that contain both code and part of a multi-line comment, for example:

someCall(); /* Start comment
this a comment
even this
*/ thisShouldBeCounted();

EDIT
The following version is a bit more cumbersome but counts the example above correctly.

count = 0
comment_start = false
file.each_line do |ln|
  # Manage multiline and single line comments.
  # Exclude single line if and only if there isn't code on that line
  next if ln =~ %r{^\s*(//|/\*[^*]*\*/$|$)} or (ln =~ %r{^\s*/\*} .. ln =~ %r{\*/}) or (comment_start and not ln.include? '*/')
  count += 1 unless comment_start and ln =~ %r{\*/\s*$}
  # A comment stays open only if the last '/*' on the line has no '*/' after it,
  # so an inline /* ... */ after code does not swallow the following lines.
  comment_start = (ln.rindex('/*') || -1) > (ln.rindex('*/') || -1)
end
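
As an alternative to stacking regular expressions, the same rule can be sketched as a small character-by-character scanner. This is a different technique from the snippets above, not a refinement of them. The method name `count_code_lines` is made up for illustration, and the sketch assumes string and character literals never span lines (so Java text blocks are not handled).

```ruby
# Counts the lines of `source` that still contain non-whitespace characters
# once // and /* ... */ comments are ignored. Quotes are tracked so that
# comment markers inside string or char literals are not misread.
def count_code_lines(source)
  count = 0
  in_block_comment = false        # carries over from one line to the next

  source.each_line do |line|
    line = line.chomp
    has_code = false
    quote = nil                   # the opening quote while inside a literal
    i = 0

    while i < line.length
      two = line[i, 2]
      if in_block_comment
        if two == '*/'
          in_block_comment = false
          i += 2
        else
          i += 1
        end
      elsif quote
        if line[i] == '\\'
          i += 2                  # skip the escaped character
        else
          quote = nil if line[i] == quote
          i += 1
        end
      elsif two == '//'
        break                     # rest of the line is a comment
      elsif two == '/*'
        in_block_comment = true
        i += 2
      else
        ch = line[i]
        quote = ch if ch == '"' || ch == "'"
        has_code = true unless ch =~ /\s/
        i += 1
      end
    end

    count += 1 if has_code
  end

  count
end
```

It could be called as `count_code_lines(File.read(path))`, where `path` is whatever Java file is being measured. On the four-line example above it counts the first and last lines and skips the two in between.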