While looking at AOP in the last few days I noted that the pointcut notation allows target classes, methods and fields to be specified using annotations in place of class, method or field names. I wondered if there might not be a similarly useful feature that could be included in Byteman rules e.g.

RULE annotation example

CLASS @org.my.ClassTag

METHOD @org.my.MethodTag

HELPER org.my.TagHelper

IF TRUE

DO helpMe($0)

ENDRULE

Of course, this presents one or two issues. Here are some initial questions and possible answers.

Firstly, what are the implications of making references to method parameters such as $0, $1 etc in the absence of a specific class or method?

These are not actually resolved at trigger injection time anyway. So, an inappropriate parameter reference would merely raise a type exception at first triggering and disable the rule, e.g. if helpMe expects an org.my.Foo and the rule is applied to a tagged class other than Foo or its subclasses then a type error ensues. There is nothing to stop method annotations being used in conjunction with a signature, thereby ensuring that triggering is only enabled for methods with the requisite matching parameter types e.g.

METHOD @org.my.MethodTag(Foo, Bar)

This helps but does not resolve the issue regarding $0. It might be possible to combine the CLASS annotation with a specific type for the target class e.g.

CLASS @org.my.ClassTag Foo

meaning only apply the rule to a class with name Foo if it is annotated. More usefully,

CLASS @org.my.ClassTag ^Foo

would apply to Foo or any of its subclasses. Note that with this extended notation the original bare annotation is simply an abbreviated version of the equivalent specification

CLASS @org.my.ClassTag ^Object

It is debatable whether the case where the ^ notation is omitted is really useful since in any given application the class Foo either will or will not be annotated. Perhaps inclusion of ^ should not be required and the unadorned class should be used as the maximal type for the classes considered for injection rather than an exact match, i.e. we drop the first case and implement the second case without the need for the ^.
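
The "annotation plus maximal type" test described above can be sketched with plain reflection. This is only an illustration of the matching rule, not Byteman code: ClassTag stands in for org.my.ClassTag, and Foo, TaggedFoo, PlainFoo, Unrelated and AnnotatedTargetMatcher are hypothetical names invented for the example. It also assumes the annotation has runtime retention so that reflection can see it.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for org.my.ClassTag. RUNTIME retention is assumed,
// otherwise isAnnotationPresent() could never report it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ClassTag {}

// Hypothetical class hierarchy used only to exercise the matcher.
class Foo {}
@ClassTag class TaggedFoo extends Foo {}
class PlainFoo extends Foo {}
@ClassTag class Unrelated {}

final class AnnotatedTargetMatcher {
    private AnnotatedTargetMatcher() {}

    // A candidate qualifies when it is the bound type or one of its subtypes
    // AND it carries the tag. Using Object.class as the bound reproduces the
    // bare "CLASS @org.my.ClassTag" form.
    static boolean matches(Class<?> candidate,
                           Class<? extends Annotation> tag,
                           Class<?> bound) {
        return bound.isAssignableFrom(candidate)
                && candidate.isAnnotationPresent(tag);
    }
}
```

With Foo as the bound, TaggedFoo qualifies while PlainFoo (not tagged) and Unrelated (tagged but outside the hierarchy) do not. Note that isAnnotationPresent also reports annotations inherited from a superclass when the annotation type is marked @Inherited, so a real implementation would have to decide whether that is the intended behaviour.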

Secondly, what does this require the transformer and retransformer to do in order to be able to identify the relevant target classes/methods and ensure that the rule code gets injected where specified?

Using an annotation for the CLASS specifier requires checking every class as it is loaded, which will make transformation slower in all cases -- but probably no worse than is already implied by using injection down class hierarchies. Similarly, it will require checking all existing loaded classes for annotations during retransformation when a rule with a class annotation is first installed, either during bootstrap or during dynamic load via the listener. This is probably a lot more work -- although this might be mitigated by using a cache of some sort.

Annotations on methods make this cost greater still by requiring a more thorough scan of class bytecode during load and requiring a larger cache and more complex cache index to track annotations already identified on loaded classes.

Is this complexity worth whatever gains this feature provides? I'm not yet convinced.

Note there is also a chance to do a similar sort of thing with AT READ @Foo and AT WRITE @Foo.

As I sat down to start using byteman to implement some bytecode injection, in many cases targeting annotated classes and methods, I realized it was not supported yet, and google sent me here directly.

Speaking for myself only, this would be an incredibly useful bit of functionality. After all, there are a lot of annotation driven software stacks out there, and I think they might be feeling pretty low if they're not supported by byteman.....

And it's decently fast too. It's possible to serialize a reflections cache so that discovered items can be reloaded after JVM recycle, and presumably, one might actually validate that these cache loaded items are still valid when they're loaded.

Another option would be to introduce a property or language item to narrow the search by package name.

You can also narrow a search by classloader, although I am not sure that would be useful in this case.

Having said all that, I suppose I could always templatize my standard BTM scripts, then execute my own Reflections searches for annotations, and dynamically generate and install resolved scripts based on the results, but it would be nice to have something built in.
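
The templating workaround described above might look roughly like the following. This is a sketch under stated assumptions: it uses plain java.lang.reflect over a list of candidate classes supplied by the caller rather than the Reflections library, it only produces rule text and does not install it, and MethodTag, OrderService and RuleScriptGenerator are hypothetical names made up for the example. The rule layout simply copies the clauses used in the example rule at the top of this thread.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical stand-in for org.my.MethodTag (runtime retention assumed).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MethodTag {}

// Hypothetical application class with one tagged and one untagged method.
class OrderService {
    @MethodTag public void place(String id) {}
    public void cancel(String id) {}
}

final class RuleScriptGenerator {
    private RuleScriptGenerator() {}

    // Emits one fully resolved rule per tagged instance method, so each
    // generated rule names an explicit class, method and signature.
    static String generate(List<Class<?>> candidates,
                           Class<? extends Annotation> tag,
                           String helperClass,
                           String action) {
        StringBuilder script = new StringBuilder();
        int ruleCount = 0;
        for (Class<?> c : candidates) {
            for (Method m : c.getDeclaredMethods()) {
                // Static and synthetic methods are skipped here because the
                // example action passes $0 (the target instance).
                if (!m.isAnnotationPresent(tag)
                        || Modifier.isStatic(m.getModifiers())
                        || m.isSynthetic()) {
                    continue;
                }
                StringJoiner params = new StringJoiner(", ", "(", ")");
                for (Class<?> p : m.getParameterTypes()) {
                    params.add(p.getTypeName());
                }
                // The counter keeps rule names unique for overloaded methods.
                script.append("RULE generated ").append(++ruleCount).append(' ')
                      .append(c.getName()).append('.').append(m.getName()).append('\n')
                      .append("CLASS ").append(c.getName()).append('\n')
                      .append("METHOD ").append(m.getName()).append(params).append('\n')
                      .append("HELPER ").append(helperClass).append('\n')
                      .append("IF TRUE\n")
                      .append("DO ").append(action).append('\n')
                      .append("ENDRULE\n\n");
            }
        }
        return script.toString();
    }
}
```

Calling generate with OrderService, MethodTag, "org.my.TagHelper" and "helpMe($0)" yields a single rule targeting place(java.lang.String) and nothing for cancel. The resulting text could then be written to a script file or submitted to a running agent by whatever means is already in use.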

No, I am afraid I have not had time even to think about this. But if you are really interested in having this feature then I would be happy to see some prototype code and help integrate it into the Byteman code base :-)

Nicholas Whitehead wrote:

Speaking for myself only, this would be an incredibly useful bit of functionality. After all, there are a lot of annotation driven software stacks out there, and I think they might be feeling pretty low if they're not supported by byteman.....

Perhaps so. However, Byteman is not necessarily the only tool (let alone the best tool) for this job. Since annotations are generally applied to sets of classes they take Byteman into the realm of bulk transformation which is quite explicitly out of its comfort zone. Byteman rule sets are very much tuned towards transformations of a small, tightly specified collection of target methods. That's not just tuned in terms of performance -- although performance is a critical concern. It has also to do with usability.

The meaning of the code in the body of a rule is very much dependent on how it makes reference to the context established by the clauses which establish the trigger location -- the point(s) where a rule gets injected. With a specific target class, method name (preferably including a signature) and AT XXX clause it is usually clear what computation each expression in the BIND, IF and DO clause encodes and hence what the rule is going to do. The less explicit these locating clauses become, the more ambiguous and subject to mismatch the body expressions may become.

So: omitting the package for a class or the signature for a method; specifying an interface rather than a class; using overriding injection; using an ALL count rather than an explicit count (or implicit count == 1); these all usually serve to render more and more fuzzy: the applicability of the rule at a potential trigger point; the types of values referenced in the rule; the specific members or methods which might be accessed/executed. Byteman can indeed resolve this fuzziness in each specific case but that does not really help a user trying to decide if a given rule will do what they want and expect.

Nicholas Whitehead wrote:

I do think that Issue#2 (finding instances of annotated classes and methods) could be quite challenging. I was thinking about how Reflections handles this:

And it's decently fast too. It's possible to serialize a reflections cache so that discovered items can be reloaded after JVM recycle, and presumably, one might actually validate that these cache loaded items are still valid when they're loaded.

Another option would be to introduce a property or language item to narrow the search by package name.

You can also narrow a search by classloader, although I am not sure that would be useful in this case.

Depending upon how the Reflections cache has been implemented, using this approach may present a problem with garbage collection. Byteman must not hang on to classes which otherwise are no longer referenced or else they will not be garbage collected. Avoiding this horn of the GC dilemma usually impales you on the other horn of relying on weak references, which has its own (unfortunate but unavoidable) costs.
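
To make the weak-reference option concrete, here is a minimal sketch of a cache that remembers whether a class carries a given annotation without keeping that class alive. It is not taken from Byteman or from Reflections; AnnotationPresenceCache is a hypothetical name, and the design assumes that a WeakHashMap keyed on the Class object with Boolean values is acceptable for the purpose.

```java
import java.lang.annotation.Annotation;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

final class AnnotationPresenceCache {
    // The tag class itself is held strongly, which pins the annotation's
    // class loader for as long as the cache lives.
    private final Class<? extends Annotation> tag;

    // Keys are held weakly, so an entry disappears once its class is no
    // longer referenced elsewhere. Values are plain Booleans and therefore
    // never refer back to the key.
    private final Map<Class<?>, Boolean> seen =
            Collections.synchronizedMap(new WeakHashMap<>());

    AnnotationPresenceCache(Class<? extends Annotation> tag) {
        this.tag = tag;
    }

    // Performs the reflective check at most once per live class.
    boolean isTagged(Class<?> candidate) {
        return seen.computeIfAbsent(candidate, c -> c.isAnnotationPresent(tag));
    }

    // Number of entries currently retained (may shrink after a GC).
    int size() {
        return seen.size();
    }
}
```

For instance, a cache built for java.lang.FunctionalInterface reports true for Runnable and false for String. The costs mentioned above still apply: weak references add work for the collector, and every lookup goes through a synchronized map.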

Anyway, I don't actually think using class Reflections is needed to deal with this problem. The Byteman agent has access to the class base via the instrumentation APIs. So, when it starts up it is quite capable of searching this list and using reflection to locate annotated classes/methods. It would not need to maintain a table of annotated classes after that because from then on it gets a chance to see each new class as it is loaded. The problem is not how to locate the relevant classes but whether the costs of doing it (by whatever chosen method) are going to be noticeable. Using reflection on the loaded class might or might not be faster than scanning bytecode but whichever mechanism is quicker the cost of doing the check will almost certainly be noticeable because you have to do it at every class load after bootstrap. I know from measuring the costs of using interface rules and overriding rules that this will be significant and I would expect that using an offline AOP transformation would give better performance. Maybe it's not as bad as I fear or maybe it's usable and useful -- I am only relying on my limited experience. If you can provide an implementation which disproves my hunches I'll be very pleased to see if we can integrate it into Byteman.
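
As a starting point for such a prototype, the startup scan over already loaded classes could be sketched as below. This is an assumption-laden illustration rather than agent code: LoadedClassScanner, Traced and Worker are hypothetical names, and the method takes a plain Class array so it can be tried outside an agent. Inside an agent the array would come from java.lang.instrument.Instrumentation.getAllLoadedClasses().

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical method-level tag (runtime retention assumed).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Traced {}

// Hypothetical class with one tagged and one untagged method.
class Worker {
    @Traced void run() {}
    void idle() {}
}

final class LoadedClassScanner {
    private LoadedClassScanner() {}

    // Returns every declared method carrying the tag among the given classes.
    static List<Method> findTaggedMethods(Class<?>[] loaded,
                                          Class<? extends Annotation> tag) {
        List<Method> hits = new ArrayList<>();
        for (Class<?> c : loaded) {
            try {
                for (Method m : c.getDeclaredMethods()) {
                    if (m.isAnnotationPresent(tag)) {
                        hits.add(m);
                    }
                }
            } catch (LinkageError | SecurityException e) {
                // Reflection can fail for classes whose dependencies cannot
                // be resolved; such classes are skipped in this sketch.
            }
        }
        return hits;
    }
}
```

Scanning Worker and String for Traced returns only Worker.run. Timing this over the full loaded-class array at startup, and the equivalent per-class check at each class load, would give a first measure of whether the cost is as noticeable as feared.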