Thanks for reporting this Alexey. I think the right way to fix it would be to modify the JmxMonitoredMap. Right now, the unregister method checks if a given key is registered with the server and if so, unregisters it. On a core reload, the key is same but the bean instance is different so all keys are unregistered.

We can modify the SolrDynamicMBean and put the parent core's hashCode as an extra attribute. Then in the unregister method, remove the mbean from the server after checking if the mbean's hashCode attribute has the same value.

Shalin Shekhar Mangar
added a comment - 29/Jun/11 12:25 Thanks for reporting this Alexey. I think the right way to fix it would be to modify the JmxMonitoredMap. Right now, the unregister method checks if a given key is registered with the server and if so, unregisters it. On a core reload, the key is same but the bean instance is different so all keys are unregistered.
We can modify the SolrDynamicMBean and put the parent core's hashCode as an extra attribute. Then in the unregister method, remove the mbean from the server after checking if the mbean's hashCode attribute has the same value.

Alexey: at first glance, i think i would prefer Shalin's suggestion over your patch.

My main hesitation about your approach is the parameterized close method – If we really go that route i'd much rather see something like a "SolrCore.preCloseToReleaesResources()" method. But more fundementally, if we unregister the MBeans before creating the new core, there is a window of time when the old core is responding to requests, but can't be monitored (and if anything goes wrong with creating the new core, the old one will continue to handle requests indefinitely but be totally unmonitorable.

That said: i suspect the fix might even be easier then what Shalin proposed (which would require making SolrCore passing itself into the JmxMonitoredMap) ... can't we essentially change JmxMonitoredMap.unregsiter(String,SolrInfoMBean) to have psuedo code like this..

Hoss Man
added a comment - 30/Jun/11 23:32 Alexey: at first glance, i think i would prefer Shalin's suggestion over your patch.
My main hesitation about your approach is the parameterized close method – If we really go that route i'd much rather see something like a "SolrCore.preCloseToReleaesResources()" method. But more fundementally, if we unregister the MBeans before creating the new core, there is a window of time when the old core is responding to requests, but can't be monitored (and if anything goes wrong with creating the new core, the old one will continue to handle requests indefinitely but be totally unmonitorable.
That said: i suspect the fix might even be easier then what Shalin proposed (which would require making SolrCore passing itself into the JmxMonitoredMap) ... can't we essentially change JmxMonitoredMap.unregsiter(String,SolrInfoMBean) to have psuedo code like this..
if (server.isRegistered(name)) {
MBean existing = server.getMBean(name)
if (existing intsanceof SolrDynamicMBean &&
existing.getSolrInfoMBean() == this .get(name)) {
server.unregisterMBean(name);
} else {
// :NOOP: MBean is not ours
}
}
...adding a package protected SolrDynamicMBean.getSolrInfoMBean() seems less invasive then passing the SolrCore to another class

Hoss, I wish there was a way to do just that. I looked and looked but couldn't find it. The JMX API is really screwed up. Once you send in a MBean, apparently you can't get it out again. I'd be interested if anyone knew of a way to do that.

Shalin Shekhar Mangar
added a comment - 01/Jul/11 10:58 Hoss, I wish there was a way to do just that. I looked and looked but couldn't find it. The JMX API is really screwed up. Once you send in a MBean, apparently you can't get it out again. I'd be interested if anyone knew of a way to do that.

There's another bug with core reload that I found while running Alexey's test. Suppose there's only one core with name "X" and you reload "X", it then becomes registered with "" as the core name. So all your jmx monitoring is now useless because the key names have changed.

Shalin Shekhar Mangar
added a comment - 07/Jul/11 11:28 There's another bug with core reload that I found while running Alexey's test. Suppose there's only one core with name "X" and you reload "X", it then becomes registered with "" as the core name. So all your jmx monitoring is now useless because the key names have changed.

Shalin Shekhar Mangar
added a comment - 07/Jul/11 12:58 Here's a patch which fixes the issue. I've reused Alexey's tests with the solution I proposed earlier.
The problem with the core name changing across reloads is something we can address in another issue.

Shalin Shekhar Mangar
added a comment - 12/Jul/11 11:02 Patch updated to trunk.
I was mistaken about the core name changing on reload.
I'll commit this shortly.
However, regardless of what name you specify in the solr.xml, the default core's name is always an empty string. Is this intentional?

However, regardless of what name you specify in the solr.xml, the default core's name is always an empty string. Is this intentional?

Yes, I guess this is part of default core name normalization. I've noticed the same problem with JMX domain "solr/" (empty collection name) after core reload, but I believe I fixed that problem in my patch.

Alexey Serba
added a comment - 12/Jul/11 11:40 However, regardless of what name you specify in the solr.xml, the default core's name is always an empty string. Is this intentional?
Yes, I guess this is part of default core name normalization. I've noticed the same problem with JMX domain "solr/" (empty collection name) after core reload, but I believe I fixed that problem in my patch.

I've noticed the same problem with JMX domain "solr/" (empty collection name) after core reload, but I believe I fixed that problem in my patch.

I removed that part from your patch because I was not sure of its consequences. Also on reload, it was going back to an empty string. I used to think that in 4.0, the default core would be registered with both the original name as well as empty string but it is not so. We should fix it but that belongs to a separate issue.

Shalin Shekhar Mangar
added a comment - 12/Jul/11 12:00 I've noticed the same problem with JMX domain "solr/" (empty collection name) after core reload, but I believe I fixed that problem in my patch.
I removed that part from your patch because I was not sure of its consequences. Also on reload, it was going back to an empty string. I used to think that in 4.0, the default core would be registered with both the original name as well as empty string but it is not so. We should fix it but that belongs to a separate issue.

Mark Miller
added a comment - 12/Jul/11 12:36 Is this intentional?
Its how noble and I first set things up if I remember right - I've wanted to address it since as it has a few not nice side affects - have never gotten around to it though.

Unlike the Ant build, the Maven build copies resources over to solr/build/, and for the Solr core+solrj tests (they run together under Maven), this includes core/src/test-files/, test-framework/src/test-files/, and solrj/src/test-files/, so the environment is somewhat different than under Ant.

I took a quick look at the code and I don't understand what's happening - do you? The file that can't be found solr/solr/conf/solrconfig.xml is instead located at solr/conf/solrconfig.xml, so it looks like an extra solr/ is being prepended for some reason?

Steve Rowe
added a comment - 15/Jul/11 14:23 Hi Shalin,
The new test you added has never succeeded under Maven - here's the first failure:
https://builds.apache.org/job/Lucene-Solr-Maven-3.x/178/testReport/junit/org.apache.solr.core/TestJmxIntegration/testJmxOnCoreReload/
Unlike the Ant build, the Maven build copies resources over to solr/build/, and for the Solr core+solrj tests (they run together under Maven), this includes core/src/test-files/ , test-framework/src/test-files/ , and solrj/src/test-files/ , so the environment is somewhat different than under Ant.
I took a quick look at the code and I don't understand what's happening - do you? The file that can't be found solr/solr/conf/solrconfig.xml is instead located at solr/conf/solrconfig.xml , so it looks like an extra solr/ is being prepended for some reason?

Steve Rowe
added a comment - 15/Jul/11 14:26 - edited The new test you added has never succeeded under Maven
That's incorrect - the new test has never succeeded under Maven on branch_3x . On trunk, it succeeds. Which makes this failure even stranger...

I can cause this to fail under IntelliJ if I set the working directory to the location where all the resources are copied to. That is, if SolrTestCaseJ4.getFile() finds solr/conf/ in the current directory, solr home is set to ./solr/solr/.

Steve Rowe
added a comment - 16/Jul/11 21:57 the new test has never succeeded under Maven on branch_3x. On trunk, it succeeds.
It failed today on the Jenkins trunk Maven build, so I guess it's not so strange: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/189/testReport/junit/org.apache.solr.core/TestJmxIntegration/testJmxOnCoreReload/
I can cause this to fail under IntelliJ if I set the working directory to the location where all the resources are copied to. That is, if SolrTestCaseJ4.getFile() finds solr/conf/ in the current directory, solr home is set to ./solr/solr/ .

Fixed (and simplified) test to compare registered mbeans with info registry size. Comparing number of mbeans between core reloads is flawed because when tests run in parallel, mbeans from multiple tests can be registered in the same mbean server.

Shalin Shekhar Mangar
added a comment - 20/Jul/11 10:43 Fixed SolrDynamicMBean.getAttribute to support reading coreHashCode.
Fixed (and simplified) test to compare registered mbeans with info registry size. Comparing number of mbeans between core reloads is flawed because when tests run in parallel, mbeans from multiple tests can be registered in the same mbean server.