Add method coverage and branch coverage metrics

Since the Coverage extension was introduced in Ruby 1.9, Ruby has had built-in line code coverage. Ruby should support more of the basic code coverage metrics [1]. I have a pull request on GitHub ( https://github.com/ruby/ruby/pull/511 ) to add Method Coverage (Function Coverage) and Branch Coverage. I'd love feedback to improve it.

which includes
* the current Ruby line coverage report,
* as well as a method report (The method defined on line 1 was called 2 times; the method on line 15 was called 0 times; ...),
* and a branch report (the branch on line 8 was called 2 times; the branch on line 11 was called 0 times).

Branches include the bodies of if, elsif, else, unless, and when statements, which are all tracked with this new feature. However, this feature is not aware of void bodies, for example:

if foo
:ok
end

will report that only one branch exists in the file. It would be better to declare that there is a branch body on line 2, and a void branch body on line 3, or perhaps line 1. This would require the keys of the [:branch] Hash to be something other than line numbers. Perhaps label_no? Perhaps nd_type(node) paired with line or label_no?

I was not very clear on the bit-arrays used in ruby.h, and just used values for the new macros that seemed to work.

Also, I would much rather use Ranges to identify a branch, so that a Coverage analyzer like SimpleCov won't need any kind of Ruby parser to identify and highlight a full chunk of code as a tested branch, or a not tested branch. I'm trying to find how that could be implemented...

I suggest using a coverage struct pointer in rb_iseq_struct to hold all
3 coverage VALUEs. rb_iseq_struct is gigantic already, so adding 2 more
VALUEs for infrequent coverage use can lead to even more bloat.

I completely rewrote and rebranded "Branch" coverage into "Decision" coverage after reading Steve Cornett's paper [1] closer. The resultant metric is much more robust, tracking void elses, and one-line if/else combos much better. Now every conditional statement, like:

:yes if 2+2 == 4

will have a count of truthy decisions, and falsey decisions to come out of the statement. This code here would register 1 truthy decision, and 0 falsey decisions, resulting in incomplete decision coverage.

Patch is attached. Also, I forgot to mention that this feature is well-tested, with the updated and new tests in test/coverage/

I took the same safety precautions that appeared around the new code. So, for example in iseq.c, I setup isec->coverage in the same place that I see iseq-> compile_data is set (lines 117 and 124). The other spot you might be talking about is thread.c, inside the new update_method_coverage() and update_decision_coverage(). I added more precaution to the update_decision_coverage() method, and could do the same to update_method_coverage().

Ah, I see. You're right; it's much better to allocate it only when tracking coverage. I've implemented that in a new commit [1]; I only had to add more checks to the _TRACE macros at the top of compile.c.

How about keeping Coverage.result the same, and allowing
Coverage.result2 or maybe Coverage.result(:all)?

I'm excited about this feature, but we should try to not break
existing tools.

Another thing I noticed with your latest fix:
You call rb_hash_lookup(coverages, path) 3 times more than you
need to. I don't know if there's a measurable speed difference,
but it's still ugly (and yeah, I just spent a fair amount of time
trying to eek out the last bit of hash lookup performance in #9425).

I'd actually like to keep the format of Coverage.result as the new format (Coverage.result values are each a Hash with :lines, :methods, and :decisions keys), rather than the existing Ruby 2.1.0 format, for two reasons:

1) Currently, the call to Coverage.result is very destructive: it immediately freezes the Coverage results, and clears the line coverage arrays. I don't think we can cleanly retrieve the line coverage, then maybe later the method coverage, always leaving the collective coverage results in an indeterminate "some cleared and some not cleared" state... and not really remember (or even let be discoverable) which results have been wiped and which haven't. [1]

2a) The very top of the Coverage documentation reads "Coverage provides coverage measurement feature for Ruby. This feature is experimental, so these APIs may be changed in future." so tools should be ready for a change.

2b) This change should be very easy to sniff out. For example, in SimpleCov, only one small change is needed in simple_cov/result.rb [2] in order to remain compatible with old Coverage and new:

Of course the decision isn't up to me, but multiple methods like Coverage.result2 or Coverage.result(metric = :lines) or something feels bad... maybe there is another solution.

I'll amend my code with fewer rb_hash_lookups. My code there is super ugly... your suggestion will be much better. And #9425 is looking very promising!

[1] footnote: I'm not a huge fan of the Coverage.result methodology... but I think it was written this way so that the returned Coverage hash of results is now eligible for garbage collection. I'd rather a method called maybe Coverage.end that ends coverage (and neatly pairs with Coverage.start) and returns results that it does today. AND that those results can again be retrieved later, at will, by calling Coverage.result. Those results would be valid and unchanged until Coverage.start is called again, at which point the results are cleared (rather than when Coverage.end is called). But this would be for another ticket, if its even worth opening soon.
[2] https://github.com/colszowka/simplecov/blob/master/lib/simplecov/result.rb#L26

Fair enough on the changes (I cannot make decisions on API changes).
We shall wait for others (mame?) to respond.

Yes, I'm the original author and current maintainer of ext/coverage.

I'm positive for Branch coverage itself! I don't care the API so much because it is a "backend" library that casual users should not use directly. However, I don't think that it is a good idea to break compatibility with no strong reason.

I also thank you for providing a patch. But sorry, at the moment, I can't afford the time to review it. For a time, could you address the problems that Eric pointed?

Eric, I could not recreate your fork failure (I'm on OS X 10.6, compiling with gcc 4.2.1...). However, I made update_method_coverage safer, thanks to your backtrace. I've updated the pull request with that fix, and fewer rb_hash_lookup calls:

Endoh-san and Eric, thanks both for considering these changes. I'm in favor of changing the format of Coverage.result because of the reasons I outline in above, but I can give more evidence that it should be a very safe change:

After a code search on GitHub [1], and after looking at the Code Metrics section of The Ruby Toolbox [2], it is clear that the Ruby community in general (at least open source) rely almost 100% on the SimpleCov gem [3]. Many of the other metrics tools found in the Ruby Toolbox use SimpleCov (coveralls, flog, rails_best_practices, and more) and never call Coverage.result manually. I can find no other libraries currently maintained that execute "Coverage.result".

And I still just don't like the idea of some Coverage results being duped/frozen/cleared, while others are not.

Another idea was suggested to me, where we introduce an optional parameter to Coverage.start, which defaults to :lines. Old libraries can continue to call Coverage.start, which will cause Coverage.result to return a Hash of Ruby files mapped to line counts. New libraries (or updated SimpleCov) can call Coverage.start(:all), which will cause Coverage.result to return the new, more complicated format.

This solution, however, makes it harder for SimpleCov to be compatible with old AND new format: the library must figure out whether Coverage.start takes an optional argument or not (or use something like RUBY_VERSION), track that, and parse Coverage.result differently based on the answer. I would again just prefer upgrading the format to this solution, allowing SimpleCov to just test coverage.is_a? Array.

Another design question: I think that block calls should probably also be considered part of "Method Coverage" (which would then be renamed to "Function Coverage", tracking both method and block coverage). One big design problem here is that it is common to chain several blocks in one line, like iterators. Should I put in the work to track block coverage now, or wait until after this gets merged?

In fact, I think we can virtually implement this feature by using a Ruby code parser, such as ripper.

For example, method coverage is usually identical to the execution count of the first line of each method.
(Of course, to make it precise, there might be many annoying cases, such as a method defined in one line.)

In similar way, you can measure decision coverage by parsing if/then/else and case/when statements.

If there is a great demand for this feature, I'm not against embedding it to the core. But, is it really needed?

Is it fully-clarified what type of visualization and analysis you want to do? Does the proposed API give you enough information for your use case?
If not, we will have to extend the API repeatedly, or even worse, the API will turn out not to be unusable after it is released.

For example, method coverage does not include method name. Decision coverage does not include lineno of "else" statement. Are they okay?

The current API is designed for an apparent use case: to visualize the execution count of each line.

By using ObjectSpace.each_object, a user can get a reference to these objects and destroy them. You should use RBASIC_CLEAR_CLASS to make them invisible for users. (But invisible objects may cause another problem. As I recall, rb_hash_lookup might not be used for an invisible object.)

clear_coverage deletes the coverage information measured so far. I think it also should delete method and decision coverage.
When Kernel#fork is called, this function is used in the child process, because the parent and the child has the same coverage information which may lead to duplicated measurement.

I think that Ripper is inadequate for these new metrics for the exact reason you mention: if/else code and methods that are all defined within one line, as well as implicit "else". Code like this of course exists everywhere, so a tool would be greatly inadequate if it could not give metrics regarding lines like "return if x.nil?" or "def foo; bar.baz; end".

On line 171, the report shows that the "if" branch is not taken, and this if/else is all one line! Line coverage shows that the line is executed, because at a minimum, the line is reached, and the condition is evaluated.

On line 522, the report shows that an implicit "else" branch is not taken. This is important because the line coverage looks fine, but the user is unaware, without Decision Coverage, that no test exercises a false condition in that "if." I think this is perhaps the most exciting and useful example of decision coverage.

I used the mail gem specs as a slightly longer performance test. Without Coverage, the specs take 10.5 seconds. The current Coverage library increases that by 14%. My proposed changes instead increase by 57% (6 seconds total). Ouch!

I used the jekyll gem specs as a much longer performance test. Without Coverage, the specs take 112 seconds. The current Coverage library increases that by 4%. My proposed changes instead increase by 11% (12 seconds total). Not too bad...

Maybe this proposal (with Hashes) isn't too bad. However, I wrote a patch to my proposed changes (changing the :methods and :decisions values to be Arrays instead of Hashes), which decreases the slowdown by some, but not much (instead of increasing mail specs by 57%, this patch only increases them 38%; 4 seconds).

Another possible solution is to allow the user to specify what should be tracked, with something like