Hi,
Has anybody tried to run multiple RegionServers on a single physical node? Are there deep technical issues or minor impediments that would hinder this?

We are trying to do this because we are facing a lot of GC pauses on the large heap sizes (~70G) that we are using, which leads to a lot of timeouts in our latency-critical application. More processes with smaller heaps would help in mitigating this issue.

On 12/03/2012 04:39 PM, Ishan Chhabra wrote:
> Hi,
> Has anybody tried to run multiple RegionServers on a single physical
> node? Are there deep technical issues or minor impediments that would
> hinder this?

Can you provide more information about your setup?
- Network
- Disk schema
- RAM

> We are trying to do this because we are facing a lot of GC pauses on the
> large heap sizes (~70G) that we are using, which leads to a lot of timeouts
> in our latency critical application. More processes with smaller heaps
> would help in mitigating this issue.

Have you read this, Ishan?
http://www.cloudera.com/blog/2011/04/hbase-dos-and-donts/

Not tried multi-RS on a single node, but have you looked at the off-heap cache? It's a part of 0.92.x. From what I understand, that feature was designed with this case in mind (i.e., you are trying to do a lot of caching but don't want to introduce GC issues in the RS).

> Hi,
> Has anybody tried to run multiple RegionServers on a single physical
> node? Are there deep technical issues or minor impediments that would
> hinder this?
>
> We are trying to do this because we are facing a lot of GC pauses on the
> large heap sizes (~70G) that we are using, which leads to a lot of
> timeouts in our latency critical application. More processes with smaller
> heaps would help in mitigating this issue.
>
> Any experience or thoughts on this would help.
> Thanks!
>
> --
> *Ishan Chhabra* | Rocket Scientist | Rocketfuel Inc. | *m* 650 556 6803

I too am interested in running multiple RS on a single node.

I have a very small cluster where all nodes are identical. However, I was just given a very powerful node to add into this cluster, which effectively doubles the total CPUs, RAM, and HDDs in the cluster.

As such, when I run a MR job, half the jobs go to this single, new node, yet most of the data is not local due to HBase balancing the regions.

Does it make sense for me to run multi-RS on this node?

I would like to add my 2 cents. Even if we have superior CPU, RAM, and disk, IO still remains the bottleneck. However powerful the CPU is, it will never have much impact on overall performance unless IO improves considerably. Also, if we have multiple RSs, our DN and TT may face memory issues. What do you guys say?

> I too am interested in running multiple RS on a single node.
>
> I have a very small cluster where all nodes are identical. However, I was
> just given a very powerful node to add into this cluster which effectively
> doubles the total CPUs, RAM, and HDDs in the cluster.
>
> As such, when I run a MR job half the jobs go to this single, new node yet
> most of the data is not local due to HBase balancing the regions.
>
> Does it make sense for me to run multi-RS on this node?
>
> --
> Robert Dyer
> [EMAIL PROTECTED]
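The memory concern raised above (several RSs sharing a node with the DN and TT) can be checked with a quick budget calculation. What follows is only a rough sketch: the function name, the 96 GB node, the 1 GB daemon heaps, the task slot count, and the 10% per-JVM overhead are all hypothetical values chosen for illustration, not figures from anyone's cluster in this thread.

```python
def memory_budget_gb(total_ram_gb, rs_count, rs_heap_gb,
                     datanode_gb=1.0, tasktracker_gb=1.0,
                     task_slots=0, task_heap_gb=0.0,
                     jvm_overhead=0.10):
    """Return the RAM (in GB) left for the OS and page cache.

    Every JVM heap on the node is summed, then inflated by jvm_overhead,
    an assumed allowance for non-heap memory. A negative result means
    the node is over-committed.
    """
    heaps = (rs_count * rs_heap_gb      # all RegionServer heaps
             + datanode_gb              # DataNode heap
             + tasktracker_gb           # TaskTracker heap
             + task_slots * task_heap_gb)  # MapReduce child JVMs
    committed = heaps * (1.0 + jvm_overhead)
    return total_ram_gb - committed


# Hypothetical 96 GB node: one 70 GB RegionServer vs. four 16 GB RegionServers,
# each alongside a DataNode, a TaskTracker, and eight 1 GB task slots.
single = memory_budget_gb(96, rs_count=1, rs_heap_gb=70,
                          task_slots=8, task_heap_gb=1.0)
multi = memory_budget_gb(96, rs_count=4, rs_heap_gb=16,
                         task_slots=8, task_heap_gb=1.0)
print("left for OS/page cache: single=%.1f GB, multi=%.1f GB" % (single, multi))
```

Plugging in the real RAM size and heap settings of a node shows whether a given number of RegionServers still leaves room for the other daemons.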

Emm, have you tried tuning your GC in depth? Please provide the exact VM options, JDK version, and GC logs.
In our test cluster this week, I managed to reduce the longest STW pause from 22+ seconds (Xmx20G) to 1.1s (Xmx48G) under a very heavy, long-running YCSB stress test.

Also, it would be better to ask for help on the hotspot-gc-use/hotspot-gc-dev mailing lists :)
And the G1GC in jdk7u4+ is a potential solution for the large-heap scenario as well :)
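For anyone who wants to compare the longest STW pause before and after a tuning change, here is a minimal sketch of how it could be pulled out of a GC log. It assumes the JVM was started with -XX:+PrintGCApplicationStoppedTime, so that the log contains "Total time for which application threads were stopped" lines. The function names and the 1-second threshold are illustrative choices, not anything taken from the setups discussed in this thread.

```python
import re

# Matches the lines the JVM writes with -XX:+PrintGCApplicationStoppedTime, e.g.
# "Total time for which application threads were stopped: 0.0123450 seconds"
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"(\d+(?:\.\d+)?) seconds"
)


def stw_pauses(lines):
    """Yield every stop-the-world pause (in seconds) found in the log lines."""
    for line in lines:
        match = STOPPED_RE.search(line)
        if match:
            yield float(match.group(1))


def summarize(lines, threshold_s=1.0):
    """Return (number of pauses, longest pause, pauses longer than threshold_s)."""
    pauses = list(stw_pauses(lines))
    longest = max(pauses) if pauses else 0.0
    over = sum(1 for p in pauses if p > threshold_s)
    return len(pauses), longest, over
```

Calling summarize(open(path)) on a log file path returns the three numbers in one pass. Note that these lines cover all safepoint pauses, not only GC ones, so the figures are an upper bound on the GC-induced stalls.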

> Emm, have you tried tuning your GC in depth? Please provide the exact VM
> options, JDK version, and GC logs.
> In our test cluster this week, I managed to reduce the longest STW pause
> from 22+ seconds (Xmx20G) to 1.1s (Xmx48G) under a very heavy,
> long-running YCSB stress test.

Do you have any further explanation of your specific case? Looks interesting :-)


Hi Xieliang,
You have put together an interesting set of GC optimizations, similar to what I concluded after extensive GC tuning recently. For latency-critical applications running on modern servers with large RAM and multicore CPUs, the key seems to be minimizing the stop-the-world pauses caused by Young GC, CMS initial-mark, and CMS remark. Your GC options seem to capture that very well. Thanks for sharing!

On Tue, Dec 11, 2012 at 12:42 AM, 谢良 <[EMAIL PROTECTED]> wrote: