The first thing I notice is that it appears you're binding your servers to 0.0.0.0. Is that correct? If so, I recommend changing your configuration so that each server binds to its specific IP address or hostname. I believe binding to 0.0.0.0 is causing your cluster to malfunction: each node broadcasts a connector pointing to 0.0.0.0, which is meaningless to the other node, so in essence each node is forming a cluster connection with itself.
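As a sketch of what I mean (the host and port values here are placeholders, and this is the standalone-style hornetq-configuration.xml syntax; under WildFly the equivalent is usually driven by socket-bindings and the server's bind address):

```xml
<!-- Sketch only: replace 192.168.1.10 with each node's real, reachable address. -->
<connectors>
   <connector name="netty-connector">
      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
      <!-- The connector is what gets broadcast to other nodes,
           so it must be an address they can actually reach. -->
      <param key="host" value="192.168.1.10"/>
      <param key="port" value="5445"/>
   </connector>
</connectors>

<acceptors>
   <acceptor name="netty-acceptor">
      <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
      <!-- Bind to the node's own address, not 0.0.0.0 -->
      <param key="host" value="192.168.1.10"/>
      <param key="port" value="5445"/>
   </acceptor>
</acceptors>
```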

I also see the INFO message that you mentioned in the log.

As the documentation states, the <address> of a cluster-connection does not support wildcard matching, so this cluster-connection is probably completely useless. Even if you removed the ".#" from the <address>, I think it would still cause problems because it would load-balance internal notification messages. Can you clarify why, specifically, you configured this cluster-connection?
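For contrast, here is what a typical cluster-connection looks like. The <address> is a literal prefix (e.g. "jms" matches every address that starts with "jms"), not a wildcard pattern; the names and the discovery-group below are illustrative:

```xml
<cluster-connections>
   <cluster-connection name="my-cluster">
      <!-- Literal prefix match: load-balances messages for any address
           beginning with "jms". There is no wildcard expansion here. -->
      <address>jms</address>
      <connector-ref>netty-connector</connector-ref>
      <discovery-group-ref discovery-group-name="my-discovery-group"/>
   </cluster-connection>
</cluster-connections>
```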

We took this configuration from our JBoss 4 HornetQ configuration. Do you think this configuration is causing the problem? Do we need to remove it?

I can assist with some history here. We are porting our application from JBoss 4 / HornetQ 2.4.4 (grafted / hacked) to JBoss 8. As part of clustering problems with JB4 we reached out to the HornetQ team and were told to add this cluster connection (that I believe is not documented anywhere in the HornetQ docs). Your comment of "internal notifications" sounds familiar. In any case, one of our engineers asked the question of why our cluster was not working, and we were instructed to add this block. Kiran states, "we took this from our JB4 configuration", which is true. But we only added it because we were told to. We really don't understand why it was needed, or what it does.

We did finally get clustering working on JB4. But only for UDP. We have customers running clusters that cannot multicast between nodes, so as part of our port to JBoss 8, we're also attempting to make *everything* TCP, from discovery to message delivery. Hence "tcpping", and the special JGroups stack.

Our application is a homogeneous cluster of MDBs listening for messages that then call session beans, etc., as per accepted practices. We've worked to make the HornetQ RA / MDB session pool / thread pool counts the same to avoid odd behavior and thrashing. No errors in the logs, but no messages are distributed to nodes other than our "primary" either.

So that's where we are, in case that triggers any thoughts. I'd sure like to understand what the hornetq.# cluster configuration is for, why we may have been told to put it in, and what effect taking it out might have (better? worse?).

I'm a member of the HornetQ engineering team, and I still don't understand why you'd need that cluster connection. As far as I can tell it is meaningless now and always has been unless you had addresses that literally started with "hornetq.#" (e.g. "hornetq.#foo"). I'd be curious to see the original forum thread related to this configuration decision (assuming and hoping this exchange happened on the forum). As always, applying configuration without understanding what it does is dangerous.

As far as HornetQ clustering goes, you don't need JGroups to get TCP-based discovery. You can simply use static discovery, as discussed in the documentation. I think it's more straightforward to understand and configure than JGroups.
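A rough sketch of static discovery in hornetq-configuration.xml terms (connector names and addresses are placeholders; the <static-connectors> list takes the place of a discovery-group, so no multicast is needed):

```xml
<connectors>
   <!-- Connector pointing at the other node in the cluster -->
   <connector name="node2-connector">
      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
      <param key="host" value="192.168.1.11"/>
      <param key="port" value="5445"/>
   </connector>
</connectors>

<cluster-connections>
   <cluster-connection name="my-cluster">
      <address>jms</address>
      <connector-ref>netty-connector</connector-ref>
      <!-- Explicit list of the other nodes: no UDP/multicast discovery -->
      <static-connectors>
         <connector-ref>node2-connector</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>
```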

At the end of the day I'd recommend starting with a working cluster configuration (e.g. one of the clustered configurations that ship with Wildfly) and work backwards from there. Don't add anything you don't absolutely need or understand. A homogeneous cluster is a basic use-case that should work out-of-the-box.

We investigated that very idea (static clustering), but we need to run in domain mode, not "clustered standalone" mode (for one basic reason: auto-deployment of our EAR application), and the example was a standalone.xml example. For this static connector definition to work in domain mode, instead of each node having connections only to the *other* nodes, the single shared configuration would have to define connections to *all* nodes (which, for each node, includes "self"). This appears to be an anti-pattern, as the WildFly docs describe:

3. Configure the appropriate Netty connector(s) - one for each of the other nodes in the cluster. [ah. there's the problem]

There is even this explicit statement:

Note: Do not configure a node with a connection to itself. This is considered as a misconfiguration.

So we're confused about where to start. The static clustering example is for standalone.xml. Domain mode requires a single, homogeneous configuration file to go to all nodes, but the docs say configuring a static connection to "self" is bad. This is why we went to JGroups and TCP as the only alternative we could think of.

Given our domain-mode constraint, how would you suggest we proceed? Is there an example for "homogenous TCP-only domain-mode cluster?"

Thanks,

Mark

P.S. - I could probably dig up the conversation we had with your team. It was with yet another engineer; I was not directly part of that conversation. I'd also be interested to read it.