Feature #2565

adding hooks for better tracing

I made a commit that embedded dtrace probes into Ruby so that you can
profile a Ruby application at runtime. (r26235)

Adding probes had been approved at a Ruby developers' meeting;
however, the commit was a little larger than other committers
expected, and I got some objections to it.
In the end, I decided to temporarily revert the commit. (r26243)

I discussed how we should support dynamic runtime tracing with ko1,
mame, naruse, unak and shyouhei. The problems with the commit were:
* The probes duplicated the event_hook framework
(rb_add_event_hook, Kernel#set_trace_func).
* The design of the probes was not verified enough.
* More trial and error is necessary to make clear what is
needed to trace a Ruby application.

I accepted ko1's suggestion:
* reverting the commit
* adding some hooks for rb_add_event_hook().
* implementing probes for dynamic runtime tracing on top of the event_hook framework.
* these probes can be implemented as a gem
* I will get a chance for trial and error.
* The probes may be merged into Ruby itself once they are well
designed and have enough use cases.
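
For readers who have not used the event_hook framework, here is a minimal sketch of its Ruby-level entry point, Kernel#set_trace_func. It only illustrates the kind of events the framework already reports; it is not the attached patch, and the helper collect_events and the method greet are hypothetical names made up for this example.

```ruby
# Collect [event, method id] pairs reported by the event_hook framework
# while the given block runs. collect_events is a hypothetical helper.
def collect_events
  events = []
  set_trace_func proc { |event, _file, _line, id, _binding, _klass|
    events << [event, id]
  }
  yield
  events
ensure
  # Always remove the hook again so later code is not traced.
  set_trace_func(nil)
end

# Hypothetical method used only for this demonstration.
def greet
  "hello".upcase
end

events = collect_events { greet }
events.each { |event, id| puts "#{event} #{id}" }
```

The recorded list contains a "call" event for greet and a "c-call" event for upcase, among others. Probes built on these hooks would forward similar information to dtrace instead of appending it to an array.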

Here is a patch to add the hooks ko1 and I talked about. (attached)
And here is an extension library that provides probe points to dtrace,
on top of the hooks. (http://github.com/yugui/vm_probes )

History

All the event types you added are C-level events which set_trace_func
cannot monitor. But I think that RUBY_EVENT_RESCUE may be a Ruby-level event
because it is a language-level event like RUBY_EVENT_RAISE.

In addition, the modification of thread_reset_event_flags is just a bug fix.
So that part should be committed in advance as a separate commit.

Hi, I am interested in this functionality too. I'm interested in this because I have an application where object creation has increased. The increased objects are hash, array, and string literals. The existing trace methods will not be triggered on allocation of those objects, and the scripting / filtering / speed of dtrace is useful to me.
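
To illustrate the gap, here is a small sketch, assuming a current CRuby. It records everything Kernel#set_trace_func reports while three literals are evaluated and then looks for C-level calls on Hash, Array or String. The method name trace_literal_allocation is hypothetical.

```ruby
# Record every event set_trace_func reports while literals are evaluated.
# trace_literal_allocation is a hypothetical name for this sketch.
def trace_literal_allocation
  events = []
  set_trace_func proc { |event, _file, _line, id, _binding, klass|
    events << [event, klass, id]
  }
  hash = { a: 1 }
  array = [1, 2, 3]
  string = "literal"
  set_trace_func(nil)
  [events, [hash, array, string]]
end

events, objects = trace_literal_allocation

# Keep only C-level calls whose receiver class is one of the literal types.
allocation_calls = events.select do |event, klass, _id|
  event == "c-call" && [Hash, Array, String].include?(klass)
end

puts "objects created: #{objects.size}"
puts "allocation calls seen by the hook: #{allocation_calls.size}"
```

Three objects are created, but the hook only sees "line" events and the calls to set_trace_func itself, which is why an allocation probe has to live inside the interpreter.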

I've been patching trunk with the attached patch in order to debug my current issues. I don't think it's as complete as yugui's commit though.

I can't figure out for the life of me why this issue is not getting more traction. We're seeing more and more environments getting dtrace (nodejs now has dtrace probes) and ruby has removed them, seemingly never to return. We've got people willing to help (Aaron Patterson!!), help us help Ruby. Please?

Mark's concern about why Ruby doesn't include DTrace support is reasonable, so let me explain.
Tracing is so important that we can't implement it without careful design.
But we know that being this careful is frustrating for you.
So we are considering including experimental tracing support in 2.0 (not 1.9.3).

I use this branch on a daily basis with no noticeable performance penalties, but I have a few tasks remaining before I want to call my work "done".

Probe stability declarations

One concern that ko1 had was people relying on DTrace probe APIs. DTrace allows us to declare the stability of any particular probe. This allows us to tell DTrace consumers how much they should rely on that particular API. I like the API of the probes I've added (and the Joyent probes), so I don't want to change them. However, it may be useful to declare them as unstable until we've had multiple ruby releases that contain the probes (and the community is happy with the API).

Tests

I've started writing tests for the probes, but there is a challenge. DTrace can only be enabled by a user with elevated permissions, so the tests must be run with sudo. It's easy enough to write tests that will only execute when the user has the correct permissions, but I'm not sure what we should do about CI.
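
As a rough sketch of the "only execute with the correct permissions" part, a guard like the following could be used. It assumes that root is what dtrace requires and that the binary is simply called dtrace on the PATH; the helper name dtrace_usable? and its keyword arguments are hypothetical.

```ruby
# Hypothetical helper: true when the current process could plausibly run dtrace.
# The keyword arguments exist only so the check can be exercised without root.
def dtrace_usable?(euid: Process.euid, path: ENV.fetch("PATH", ""))
  # Assumption: dtrace can only be enabled by root.
  return false unless euid.zero?

  # Look for an executable named "dtrace" in each PATH entry.
  path.split(File::PATH_SEPARATOR).any? do |dir|
    File.executable?(File.join(dir, "dtrace"))
  end
end

puts "dtrace tests would #{dtrace_usable? ? 'run' : 'be skipped'} here"
```

In a Minitest case, calling skip from setup unless dtrace_usable? returns true would keep the probe tests out of the way for unprivileged users. It does not answer the CI question, since CI would still need a privileged runner to exercise them.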

Autotools generating probes.h

I don't understand autotools very well, so I need help with this task. At RubyConf, nobu helped me patch Makefile.in to generate the probes.h file, but it doesn't seem to be generated automatically on my system. I'm not entirely sure what the problem is.

I've attached a patch that adds dtrace probes to trunk. If nobody objects, I will apply.

The patch doesn't contain every probe I want, but I think it's in a good place to merge to trunk. I can add more probes later. :)

I don't make any objection.

Let's be clear. Is this a specification? I mean, on 2.1 and later,
should we keep supporting the same dtrace interface? If we need to keep
the interface, we need to review it carefully.

DTrace allows us to specify the stability of the probes. I've declared
the provider name of "ruby" to be stable. We don't declare any modules
or functions, so I've declared them as stable. The probes (e.g.
function-entry), as well as the type and number of arguments to the
probes are declared as unstable, so users are advised not to depend on
them.

I have a question: We have lazy sweep, which runs many short, scattered
sweep processes. Should we measure such a thing?

I think it's fine to measure. I've added probes in all the places where
we gather GC statistics (near GC_PROF_SWEEP_TIMER_START and
GC_PROF_SWEEP_TIMER_STOP). The nice thing is that the DTrace system
will provide the time, so we don't need to write any C code to calculate
timing.

I think declaring them unstable is the best conservative approach.
If we find them to be good over the long term, we can change the
stability declaration in later releases of Ruby.

I was just looking at Joyent's ruby dtrace page and saw ruby-probe (Probe that can be fired from ruby code). However, I didn't see it in Aaron's implementation here https://github.com/tenderlove/ruby/blob/probes/probes.d. Is there a reason why that's not included? Please enlighten. Thanks.

We shouldn't add it today because we don't know the license of the
Joyent patches, and someone must be the maintainer. A bigger reason is
that this is a feature that can be added via a gem download. Adding
interpreter probes cannot be done via gem download.

Maybe we will expose dtrace through ruby in the stdlib someday, but I don't
think we should today.

I don't feel very comfortable seeing dummy_probes.h as part of the patch while probes.h is generated by dtrace during the build. Either both files should already be in the patch, or both files should be created during compilation. I prefer the latter.

AFAIK, the only open issue (that Vit raised) is that
we shouldn't check in the dummy probe header file. I agree with Vit, so
I want to update my patch before merge. :-)

Well, I proposed two options: either have both versions in SCM or keep them out of SCM. I'm still in favor of the latter; however, I realized later that there is one major disadvantage. If we want to avoid needing Ruby when building from the tarball, as is possible now, then the release manager has to have either DTrace or SystemTap available on his/her system. Not sure whether that can be fulfilled :/ If not, then both files should be pre-generated by the maintainer and stored in SCM.

In other words, we should think also about associated work-flow and release management.

I don't think the header file should be generated when the tarball is
being built, but when the user installs Ruby (this is how PostgreSQL
does their dtrace headers).

I'm working on a sed script to convert the .d file to a .h file in the
case that the user doesn't have dtrace on their system. Then the
release manager doesn't need to have Ruby or DTrace on their system when
making the tar.

As I said, there are several conditions which should be fulfilled. If you don't want to

1) store the generated files in SCM
2) generate the scripts during preparation of the tarball by the release manager
3) require Ruby to be available during the build from the tarball

then you have no option other than to use something (in this case sed) that is probably available on your system at build time. So although I would prefer Ruby, it does not fulfill the 3 conditions above.

I've attached a new patch that contains a sed script for converting the .d file to a .h file.

When a user tries to install Ruby, if they do not have dtrace, the sed script will convert the .d file to a dummy header file. The dummy header file will be included, and the compiler will remove all dtrace probes.
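
To make the conversion concrete, here is a rough Ruby sketch of the same idea. It is not the sed script from the patch: the RUBY_DTRACE_ macro prefix, the argument lists, the gc__sweep__begin probe name, and the function name dummy_header are all assumptions chosen for illustration.

```ruby
# Turn D probe declarations into no-op C macros.
# A sketch only; the real patch does this with sed.
def dummy_header(d_source, prefix: "RUBY_DTRACE_")
  lines = []
  # Match declarations such as: probe function__entry(const char *, int);
  d_source.scan(/^\s*probe\s+(\w+)\s*\(([^)]*)\)\s*;/) do |name, args|
    # D writes a hyphen in a probe name as a double underscore.
    macro = prefix + name.gsub("__", "_").upcase
    params = Array.new(args.split(",").size) { |i| "arg#{i}" }.join(", ")
    lines << "#define #{macro}_ENABLED() 0"
    lines << "#define #{macro}(#{params}) do {} while (0)"
  end
  lines.join("\n") + "\n"
end

# Hypothetical probe declarations, not copied from probes.d.
d_source = <<~D
  provider ruby {
    probe function__entry(const char *, const char *, const char *, int);
    probe gc__sweep__begin();
  };
D

header = dummy_header(d_source)
puts header
```

Because every probe macro expands to an empty statement and every _ENABLED() check expands to 0, the compiler can drop the probe call sites entirely, which is the effect described above.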

I have created a wiki page to describe the DTrace probes. It includes the probe names, arguments, stability, and a short description for each probe:

ruby:::line(filename, lineno);
Your patch depends on the `trace' instruction. I plan to remove the `trace'
instruction by default (if I can implement it). That will conflict with
your proposed patch.
Or shouldn't I make such optimizations?

One approach is to add a dtrace mode on the command line.
If the option is enabled, all dtrace probes in the process will be ready,
but it will be a bit slower (~10%).
Without this option, not all dtrace probes in the process will be ready.
With some implementation effort, this option could also be enabled while the process is running.

(but I can't understand the sed script; I prefer a ruby script :)
I haven't checked exactly what dtrace needs, but isn't this enough?:

We could do this, but it means that every time we change probes.d, we
would have to generate a new dummy header file and check it in. I think
the purpose of the sed script is to eliminate the "generate new dummy
header file and check it in" step.

Oops, yes. I've updated the probes. I removed the line probe and added some probes around tracing instruction sequences. It's basically the same information that the vm_collect_usage functions collect, but you don't need to recompile ruby to collect that information.