We want to implement a bill query system. We have 20M users, and the bill for each user per month contains about 10 records of 0.6K bytes each. We want to store user bills for 6 months. Of course, user queries focus on the latest month's reports. But there is no hotspot among the queried users.

We use CDH3U0 with 6 servers (each with 24G mem and 3 1T disks) for data node and region server (besides the ZK, namenode and HMaster servers). RS heap is 8G and DN heap is 12G. HFile max size is 1G. The block cache size is 0.4.

The row key is month+user_id. Each record is stored as a cell. So, a month report per user is a row in HBase.
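For concreteness, a minimal sketch of how such a row could be written with the 0.90-era client API. The table name "bill", family "r", and qualifier scheme are illustrative assumptions, not details from the thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BillWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "bill");        // hypothetical table name

        String month = "201103";                        // e.g. March 2011
        String userId = "0001234567";
        byte[] rowKey = Bytes.toBytes(month + userId);  // month+user_id, as described

        Put put = new Put(rowKey);
        // one ~0.6K record per cell; the ~10 records of a month share one row
        for (int i = 0; i < 10; i++) {
            put.add(Bytes.toBytes("r"),                 // hypothetical family
                    Bytes.toBytes("rec" + i),           // hypothetical qualifier
                    new byte[600]);                     // stand-in for a 0.6K record
        }
        table.put(put);
        table.close();
    }
}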

Currently, to store bill records, we can achieve about 30K records/second.

However, the query performance is quite poor. We can only achieve about 600~700 month_reports/second. That is, each region server can only serve queries for about 100 rows/second. Block cache hit ratio is about 20%.

> And we also tried to disable block cache; it seems the performance is
> even a little bit better. And if we use the configuration 6 DN servers
> + 3 RS servers, we can get better throughput at about 1000
> month_reports/second. I am confused. Can anyone explain the reason?

So, you mean I shall disable the block cache and make all queries go directly to DFS?

Then, the query latency may be high.

And how high a block cache hit ratio is considered acceptable? I mean, above such a ratio the block cache is beneficial.
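(For reference: the block cache can also be turned off per column family rather than for the whole cluster. A sketch against what is assumed to be the 0.90-era admin API, with the hypothetical table "bill" and family "r" from the earlier sketch:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class DisableBlockCache {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HColumnDescriptor family = new HColumnDescriptor(Bytes.toBytes("r"));
        family.setBlockCacheEnabled(false);  // reads for this family bypass the LRU block cache

        // the table must be disabled before altering its schema
        admin.disableTable("bill");
        admin.modifyColumn("bill", "r", family);
        admin.enableTable("bill");
    }
}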

2011/4/26 Ted Dunning <[EMAIL PROTECTED]>:
> Because of your key organization you are blowing away your cache anyway, so
> it isn't doing you any good.

> However, the query performance is quite poor. We can only achieve
> about 600~700 month_reports/second. Block cache hit ratio is
> about 20%.

This is random access? Why random accesses and not scans?

> Do you have any advice on how to improve the query performance?

This is hard to read but I don't see anything obnoxious.

The query is all random read. The scenario is that a user wants to query his own monthly bill report, e.g. to query what happened on his bill in March, or February, etc. Since every user may want to do so, we can't predict who will be the next to ask for such a monthly bill report.
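A sketch of one such query as a single-row Get, reusing the hypothetical table and key layout from the sketch above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class BillQuery {
    // fetch one user's monthly report: a single-row random read
    public static Result queryMonthReport(HTable table, String month, String userId)
            throws java.io.IOException {
        Get get = new Get(Bytes.toBytes(month + userId));  // month+user_id row key
        return table.get(get);                             // one RPC, one row, all ~10 cells
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "bill");           // hypothetical table name
        Result r = queryMonthReport(table, "201103", "0001234567");
        System.out.println("cells returned: " + r.size());
        table.close();
    }
}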

I haven't done much testing changing the DN heap, but in my experience it's not really of use to have 12GB there since the data never goes through the DN. Max 2GB maybe; give the rest to the region server or even the OS cache (i.e. don't allocate some GBs on purpose).
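A sketch of that split in the env files. HBASE_HEAPSIZE and HADOOP_HEAPSIZE are the standard knobs; the numbers just restate the suggestion above:

# hbase-env.sh -- keep the region server heap big
export HBASE_HEAPSIZE=8000

# hadoop-env.sh -- HDFS data does not pass through the DN heap, so ~2GB
# is plenty; leave the freed memory unallocated for the OS page cache.
# (HADOOP_HEAPSIZE covers all Hadoop daemons started on the node; a
# DN-only -Xmx can go in HADOOP_DATANODE_OPTS instead.)
export HADOOP_HEAPSIZE=2000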

From what I see in your other responses, it appears that most of your performance testing was done in a black-box fashion. Did you even try looking into where the bottleneck is? If not, then how could we even be able to tell you why 3 RS would be faster than 6, apart from making educated guesses?

As far as I can tell, you might want to see if some region server is serving most of the load or not. If it is, is it because of poor region balancing (all the hottest regions in the same place) or because of poor key design (all the reads hit only one region)? That's just one thing to look at.

As for our test, I have enabled HPROF on the client side to see what happens. According to HPROF, the client spent 96% of its time on epollWait (about 79% waiting for RS responses and 17% on communication with ZK).

I tried to enable HPROF on the RS, but failed. If I add the HPROF agent in hbase-env.sh, RS startup reports an error saying HPROF can't be loaded twice. But I am sure I only enabled it once. I don't know where the problem is.

So, for the RS, I have to work in a black-box style and only analyze the RS performance metrics to see what happens.

And, in our test, the balance is OK. When 6 RS are in service and the total TPS is 600, then according to the status on the HMaster web page, each RS handles about 100 requests.

Thanks
Weihua



This sounds like 'HBASE-3561 OPTS arguments are duplicated'. Are you running 0.90.2?
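If it is that bug, one workaround (assuming the stock hbase-env.sh layout of that era) is to attach the agent via the regionserver-specific opts instead of the shared HBASE_OPTS, so it is only appended once:

# hbase-env.sh -- HPROF on the region server only; cpu=samples keeps the
# overhead low, interval is in ms, depth is the sampled stack depth
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -agentlib:hprof=cpu=samples,interval=20,depth=10,file=/tmp/rs.hprof.txt"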

After solving HBASE-3561, I successfully ran hprof for RS and DN. Since block cache is useless in my case, I disabled it. I reran my test with 14 RS+DNs and 1 client with 200 threads. But the throughput is still only about 700. No scalability is shown in this case.

Below are the hot spots in the RS:

CPU SAMPLES BEGIN (total = 1469756) Thu Apr 28 15:43:35 2011
rank   self  accum   count  trace method
   1 44.33% 44.33%  651504 300612 sun.nio.ch.EPollArrayWrapper.epollWait
   2 19.88% 64.21%  292221 301351 sun.nio.ch.EPollArrayWrapper.epollWait
   3  8.88% 73.09%  130582 300554 sun.nio.ch.EPollArrayWrapper.epollWait
   4  4.43% 77.52%   65106 301248 sun.nio.ch.EPollArrayWrapper.epollWait
   5  4.43% 81.95%   65104 301249 sun.nio.ch.EPollArrayWrapper.epollWait
   6  4.43% 86.38%   65100 301247 sun.nio.ch.EPollArrayWrapper.epollWait
   7  4.43% 90.81%   65061 301266 sun.nio.ch.EPollArrayWrapper.epollWait
   8  4.32% 95.13%   63465 301565 sun.nio.ch.EPollArrayWrapper.epollWait
   9  2.31% 97.43%   33894 301555 sun.nio.ch.EPollArrayWrapper.epollWait
  10  1.76% 99.19%   25841 301588 sun.nio.ch.EPollArrayWrapper.epollWait
  11  0.48% 99.67%    7025 301443 sun.nio.ch.EPollArrayWrapper.epollWait
  12  0.02% 99.69%     341 301568 sun.nio.ch.NativeThread.current

I increased client threads to 2000 to put enough pressure on the cluster. I disabled the RS block cache. The total TPS is still low (with Month+User as the row key, it is about 1300 for 10 RS+DN, and with User+Month it is 700).

I used BTrace to log the time spent on each HTable.get on the RS. It shows that most of the GETs take 20~50ms and there are many GETs needing >1000ms. Almost all of this time is spent in DFSClient$BlockReader reading data from the DN. But the network usage is not high (<100Mb/s, and we have a gigabit network), so the network is not the problem.
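For reproducibility, a minimal BTrace sketch of this kind of probe. The traced class and method are assumptions about the 0.90-era server code path, not taken from the thread:

import com.sun.btrace.annotations.*;
import static com.sun.btrace.BTraceUtils.*;

// Attach with: btrace <regionserver-pid> GetLatency.java
@BTrace
public class GetLatency {
    // Time each Get handled by the region server; clazz/method are assumed.
    @OnMethod(
        clazz = "org.apache.hadoop.hbase.regionserver.HRegionServer",
        method = "get",
        location = @Location(Kind.RETURN))
    public static void onGet(@Duration long durationNanos) {
        long ms = durationNanos / 1000000;  // @Duration reports nanoseconds
        if (ms > 20) {                      // only log the slow ones
            println(strcat("get took ms: ", str(ms)));
        }
    }
}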

Since a socket connection is created for each DFS block read, I used netstat to count the TCP connections on port 50010 (the DN listen port) for each RS+DN server. It shows that there are always one or two DNs with a high connection count (>200) while the other DNs have low counts (<20). And the high-connection DNs have high disk I/O usage (about 100%) while the other DNs have low disk I/O. This phenomenon lasts for days and the hot machine is always the same one.

The high connection count mainly comes from local region server requests (~80%).

According to the source code of DFSClient, it prefers to use the local DN to fetch blocks. But why is a certain machine so popular? All my servers have almost the same configuration.

Can you figure out the most popular blocks requested? You could figure out which files they belong to by grepping for the blocks in the namenode log.

It is odd that you have this sort of request profile if your loading was even. I'd expect the DN distribution to be even.

Sounds like HDFS-347 would help for sure.

St.Ack

Sorry to jump in on the tail end.

What do you mean to say that the key is generated randomly? I mean, are you using a key and then applying a SHA-1 hash?

Which node is serving your -ROOT- and .META. tables?

Have you applied the GC hints recommended by Todd L in his blog?

Also you said: 'And almost all these times are spent on DFSClient$BlockReader to read data from DN.' What speed disks are you using and how many disks per node? (You could be blocked on disk I/O.)

-Mike

-ROOT- and .META. tables are not served by these hot region servers.

I generate the key randomly and verified it at the client by grepping the .META. table and recording the mapping from each query to its serving region server. It shows that each RS serves almost the same number of query requests.

For the GC hints, can you give me a link? I only found Todd's posts about GC tuning for writes. But in my case I only perform queries, so the one I found seems to be of no help to me.

Thanks
Weihua


Are there more blocks on these hot DNs than there are on the cool ones? If you run a major compaction and then run your tests, does it make a difference?

St.Ack


All the DNs have almost the same number of blocks. Major compaction makes no difference.

Thanks
Weihua


I had asked the question about how he created random keys... Hadn't seen a response.

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 18, 2011, at 11:27 PM, Stack <[EMAIL PROTECTED]> wrote:

> On Wed, May 18, 2011 at 5:11 PM, Weihua JIANG <[EMAIL PROTECTED]> wrote:
>> All the DNs have almost the same number of blocks. Major compaction
>> makes no difference.
>
> I would expect major compaction to even the number of blocks across
> the cluster, and it'd move the data for each region local to the
> regionserver.
>
> The only explanation that I can see is that the hot DNs must be
> carrying the hot blocks (the client queries are not random). I do not
> know what else it could be.
>
> St.Ack

I wanted to do some more investigation before posting to the list, but it seems relevant to this conversation...

Is it possible that major compactions don't always localize the data blocks? Our cluster had a bunch of regions full of historical analytics data that were already major compacted; then we added a new datanode/regionserver. We have a job that triggers major compactions at a minimum of once per week by hashing the region name and giving it a time slot. It's been several weeks, and the original nodes each have ~480GB used in HDFS, while the new node has only 240GB. Regions are scattered pretty randomly and evenly among the regionservers.

The job calls hBaseAdmin.majorCompact(hRegionInfo.getRegionName());

My guess is that if a region is already major compacted and no new data has been added to it, then the compaction is skipped. That's definitely an essential feature during typical operation, but it's a problem if you're relying on major compaction to balance the cluster.
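A sketch of a scheduler job like the one described above, against the 0.90-era admin API. The table name and the hourly-slot scheme are illustrative assumptions:

import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HServerAddress;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;

public class WeeklyCompactor {
    // Spread major compactions across 7*24 hourly slots by hashing the
    // region name, so each region is compacted about once a week.
    public static void compactCurrentSlot(Configuration conf, String tableName)
            throws Exception {
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTable table = new HTable(conf, tableName);
        int slots = 7 * 24;
        int nowSlot = (int) ((System.currentTimeMillis() / 3600000L) % slots);

        Map<HRegionInfo, HServerAddress> regions = table.getRegionsInfo();
        for (HRegionInfo region : regions.keySet()) {
            // mask the sign bit so the modulo is non-negative
            int regionSlot = (region.getRegionNameAsString().hashCode() & 0x7fffffff) % slots;
            if (regionSlot == nowSlot) {
                admin.majorCompact(region.getRegionName());
            }
        }
        table.close();
    }

    public static void main(String[] args) throws Exception {
        compactCurrentSlot(HBaseConfiguration.create(), "bill");  // hypothetical table
    }
}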


Am I right to assume that all of your data is in HBase, i.e. you don't keep anything in just HDFS files?

-Joey


On Thu, May 19, 2011 at 8:23 AM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
> Am I right to assume that all of your data is in HBase, ie you don't
> keep anything in just HDFS files?

that's right

I'm surprised the major compactions didn't balance the cluster better. I wonder if you've stumbled upon a bug in HBase that's causing it to leak old HFiles.

Is the total amount of data in HDFS what you expect?

-Joey


I think I traced this to a bug in my compaction scheduler that would have missed scheduling about half the regions, hence the 240GB vs 480GB. To confirm: major compaction will always run when asked, even if the region is already major compacted, the table settings haven't changed, and it was last major compacted on that same server. [Potential HBase optimization here for clusters with many cold regions.] So my theory about not localizing blocks is false.

Weihua - why do you think your throughput doubled when you went from user+month to month+user keys? Are your queries using an even distribution of months? I'm not exactly clear on your schema or query pattern.


Sorry for missing the background.

We assume a user is more interested in his latest bills than his old bills. Thus, the query generator works as below:
1. Randomly generate a number and reverse it as the user id.
2. Randomly generate a prioritized month based on the above assumption.
3. Ask HBase to query this user + month.
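A sketch of such a generator; the month weights are illustrative assumptions, only the latest-month bias is from the description above:

import java.util.Random;

public class QueryGenerator {
    private static final Random RND = new Random();

    // 1. random number, digits reversed, used as the user id
    static String randomUserId() {
        long n = (long) (RND.nextDouble() * 20000000L);  // 20M users
        return new StringBuilder(String.format("%08d", n)).reverse().toString();
    }

    // 2. pick one of the 6 stored months, biased toward the latest
    static int prioritizedMonth() {
        double p = RND.nextDouble();
        if (p < 0.50) return 0;  // latest month gets half the queries
        if (p < 0.70) return 1;
        if (p < 0.82) return 2;
        if (p < 0.90) return 3;
        if (p < 0.96) return 4;
        return 5;                // oldest month
    }

    public static void main(String[] args) {
        // 3. combine into the key HBase is asked for
        System.out.println("user=" + randomUserId() + " monthsBack=" + prioritizedMonth());
    }
}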

Thanks
Weihua


OK. This is why I asked you earlier about how you were generating your user ids.

You're not going to get a good distribution.

First, random numbers usually aren't that random.

How many users do you want to simulate? Try this... Create n type 5 UUIDs. These are UUIDs that have been generated, then hashed using a SHA-1 hashing algo, and then truncated to the right number of bits.
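A sketch of the type 5 (SHA-1, name-based) construction in Java. The JDK's UUID.nameUUIDFromBytes is version 3 / MD5, so the version bits are set by hand here, and the RFC 4122 namespace prefix is omitted for brevity:

import java.security.MessageDigest;
import java.util.UUID;

public class Type5Uuid {
    // Hash the name with SHA-1, truncate to 128 bits, then stamp the
    // version and variant bits required by RFC 4122.
    public static UUID fromName(String name) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-1").digest(name.getBytes("UTF-8"));

        hash[6] &= 0x0f;  hash[6] |= 0x50;          // version 5
        hash[8] &= 0x3f;  hash[8] |= (byte) 0x80;   // IETF variant

        long msb = 0, lsb = 0;
        for (int i = 0; i < 8; i++)  msb = (msb << 8) | (hash[i] & 0xff);
        for (int i = 8; i < 16; i++) lsb = (lsb << 8) | (hash[i] & 0xff);
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fromName("user-0001234567"));  // deterministic, well spread
    }
}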

This will give you a more realistic random distribution of user ids. Note that you will have to remember the user ids! It will also be alphanumeric. Then you can use your 'month' as part of your key. However... I have to question your design again. Your billing by months means that you will only have 12 months of data, and the data generation really isn't random, meaning you don't generate your data out of sequence.

Just a suggestion... It sounds like you're trying to simulate queries where users get created midstream and don't always stick around. So when you create a user, you can also simulate his start/join date and his end date and then generate his 'billing' information. I would suggest that instead of using a random number for the billing month, you actually create your own timestamp...

I am also assuming that you are generating the data first and then running queries against a static data set?

If this is true, and you create both the uuids and then the billing data, you'll get a better random data set that is going to be more realistic...

Having said all of this...

You have a couple of options..

First, you can make your key month+userid, assuming you only have 12 months of data. Or you can make your key userid+month. This has the additional benefit of collocating your user's data.

Or you could choose a third option... You are trying to retrieve a user's billing data. This could be an object. So you could store the bill as a column in a table where the column id is the timestamp of the bill.

If you want the last date first, you can do a simple trick... If you are using months... make the column id 99 - the month so that your data is in reverse order.
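A sketch of that trick; the family "b" and the zero-padding are illustrative assumptions. Because columns within a row sort lexicographically, 99 minus the month makes the newest month's qualifier sort first:

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class ReverseMonthColumn {
    // Store each monthly bill as a column whose qualifier sorts newest-first:
    // qualifier = 99 - month, zero-padded so lexicographic order matches.
    public static Put billPut(byte[] userRowKey, int month, byte[] billBlob) {
        String qualifier = String.format("%02d", 99 - month);  // month 12 -> "87", month 1 -> "98"
        Put put = new Put(userRowKey);
        put.add(Bytes.toBytes("b"),   // hypothetical family
                Bytes.toBytes(qualifier),
                billBlob);
        return put;
    }
}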

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 19, 2011, at 7:08 PM, Weihua JIANG <[EMAIL PROTECTED]> wrote:





