Clean up mbeans that return Internal Cassandra types

Details

Description

We need to clean up wherever we return internal cassandra objects over jmx. Namely CompactionInfo objects as well as Tokens. There may be a few other examples.

This is bad for two reasons

1. You have to load the cassandra jar when querying these mbeans, which sucks.
2. Stuff breaks between versions when things are moved. For example, CASSANDRA-1610 moves the compaction related classes around. Any code querying those jmx mbeans in 0.8.0 is now broken in 0.8.2. (assuming those moves stay in the 0.8 branch)

For things like CompactionInfo we should just expose more mbean methods or serialize to something standard like json.

I'd like to target this for 0.8.2. Since we've already broken compatibility between 0.8.0 and 0.8.1, I'd say just fix this everywhere now.

I agree my main argument though is that we have a history of doing it unknowingly. Especially something like this where any update the the CompactionInfo requires updating the serialization version and then breaks compatibility.

I suppose I can live with this in 1.1 and a 'let's try really really really hard not to break compatibility in the 1.0.x releases.

Nick Bailey
added a comment - 12/Oct/11 03:01 I agree my main argument though is that we have a history of doing it unknowingly. Especially something like this where any update the the CompactionInfo requires updating the serialization version and then breaks compatibility.
I suppose I can live with this in 1.1 and a 'let's try really really really hard not to break compatibility in the 1.0.x releases.

Nick Bailey
added a comment - 11/Oct/11 20:52 Obviously I didn't get this done before the 1.0 freeze. And of course, this breaks between 0.8 and 1.0 since the serial version of CompactionInfo changes.
Can we target this for 1.0.1?

Nick Bailey
added a comment - 02/Sep/11 22:30 I'm probably going to take this and try and get it done quickly before the 1.0 freeze. I'd really like to solve this problem before 1.0 and then be strict on preventing this in future mbean methods.

Another thing that might be nice to fix here is the getNaturalEndpoints() method. It currently only takes a byte array or byteBuffer object which makes it impossible to call from something like jconsole. It would be nice to overload that with another method that takes a string as the key so you can call it from jconsole.

Nick Bailey
added a comment - 17/Jul/11 19:32 I'm not sure you need to add any constructors to Range. How about just an asPair() method or something similar that returns the tokens that make up the range converted to strings?
Everything else looks fine. getRangeToAddressMap() isn't needed because getRangeToEndpointMap() is exposed.
Another thing that might be nice to fix here is the getNaturalEndpoints() method. It currently only takes a byte array or byteBuffer object which makes it impossible to call from something like jconsole. It would be nice to overload that with another method that takes a string as the key so you can call it from jconsole.

Based on my research so far scanning the MBean's and their internal users (NodeProbe, NodeCmd and CliClient), there are 4 Cassandra-type dependencies: CompactionInfo, CompactionType, Token, Range. Addressing them individually and discussing my plan:

1. CompactionInfo/CompactionType
Now, CompactionInfo/CompactionType are manageable with a Map as suggested but Range and Token are a bit tightly coupled and more involved.

2. Range
Since Range already has the partitioner (either injected or implicit from StorageService), I believe I can add 2 new constructors that look like:
public Range(String left, String right)
public Range(String left, String right, IPartitioner partitioner)

and use the partioner.getTokenFactory().fromString() to curate the left and right Token's.

based on their usages, the Range in StorageService can be safely copied to something like a Pair/Tuple.

I noticed that the getRangeToAddressMap() is not exposed on the StorageServiceMBean interface - is that by design (not that I am complaining because right now, it is 1 less dependency to decouple but if it is an omission, I need to account for it)?

3. Token
I can change all MBean interfaces that need a Token to the corresponding String representation using partitioner.getTokenFactory().toString() and then reconstruct back using the fromString()

Gaurav Sharma
added a comment - 15/Jul/11 17:28 Based on my research so far scanning the MBean's and their internal users (NodeProbe, NodeCmd and CliClient), there are 4 Cassandra-type dependencies: CompactionInfo, CompactionType, Token, Range. Addressing them individually and discussing my plan:
1. CompactionInfo/CompactionType
Now, CompactionInfo/CompactionType are manageable with a Map as suggested but Range and Token are a bit tightly coupled and more involved.
2. Range
Since Range already has the partitioner (either injected or implicit from StorageService), I believe I can add 2 new constructors that look like:
public Range(String left, String right)
public Range(String left, String right, IPartitioner partitioner)
and use the partioner.getTokenFactory().fromString() to curate the left and right Token's.
Also, to replace the StorageServiceMBean's:
public Map<Range, List<String>> getRangeToEndpointMap(String keyspace);
public Map<Range, List<String>> getPendingRangeToEndpointMap(String keyspace);
based on their usages, the Range in StorageService can be safely copied to something like a Pair/Tuple.
I noticed that the getRangeToAddressMap() is not exposed on the StorageServiceMBean interface - is that by design (not that I am complaining because right now, it is 1 less dependency to decouple but if it is an omission, I need to account for it)?
3. Token
I can change all MBean interfaces that need a Token to the corresponding String representation using partitioner.getTokenFactory().toString() and then reconstruct back using the fromString()

Nick Bailey
added a comment - 07/Jul/11 17:23 Ed,
I'm not sure I see the advantage of a JMX Composite type as opposed to using something like a Map<String, Int> approach in any place that needs 'complex' types.

Edward Capriolo
added a comment - 05/Jul/11 18:53 CompositeData is a JMX type that holds a key value map of other JMX types and can be nested.
http://docs.jboss.org/jbossas/javadoc/4.0.1-sp1/jmx/javax/management/openmbean/CompositeDataSupport.html
This should allows us to return complex objects over JMX without having to resort to JSON serializing things.

Didn't know that existed. I guess it is specifically designed to address this kind of problem?

A quick look makes it seem overly complicated and verbose. If we want to avoid json we can still allow for more complicated types like Map<String> and whatnot. The main problem is just having to include the cassandra jar when querying jmx. From what I can tell it's basically impossible to do jmx-type stuff on a non-jvm language anyway, so maybe we don't really need json.

Nick Bailey
added a comment - 05/Jul/11 16:49 Didn't know that existed. I guess it is specifically designed to address this kind of problem?
A quick look makes it seem overly complicated and verbose. If we want to avoid json we can still allow for more complicated types like Map<String> and whatnot. The main problem is just having to include the cassandra jar when querying jmx. From what I can tell it's basically impossible to do jmx-type stuff on a non-jvm language anyway, so maybe we don't really need json.

Nick Bailey
added a comment - 05/Jul/11 14:37 Off the top of my head:
A CompactionInfo object is returned when getting compactions from the CompactionManager mbean.
Sub-types of Token objects are returned when calling getTokenToEndpointMap (StorageService mbean i think).
I'll see if i can't look around for any more.

Jonathan Ellis
added a comment - 03/Jul/11 00:20 If you can list which methods to clean up, this would be a good ticket for the surprisingly frequently asked, "What's a good place to get my feet wet in the Cassandra code?"