Matteo Bertozzi
added a comment - 14/Aug/14 08:48

introduces several callQueue improvements, which can increase performance. See the JIRA for some benchmarking information

"Improvements" sounds like something that is "on by default", and we don't have anything on by default. It is more like "new options to experiment with tunings".

"ipc.server.callqueue." — there was a JIRA, which you documented, where these options were renamed to "hbase.ipc...".

For read.share, see HBASE-11724 (in progress); apparently the doc describing the 0, 0.5 and 1 values is not clear enough.

Overall, this doc doesn't seem to add any value beyond what is already in hbase-default.xml. I think the doc should provide more detailed information on why increasing that number is good or bad, what the result will be, and so on. I'll try to come up with something for you.


Matteo Bertozzi
added a comment - 14/Aug/14 20:22
ipc.server.callqueue.handler.factor

A value between <literal>0</literal> and <literal>1</literal> gives each handler
a percentage of a queue. For instance, a value of <literal>.5</literal> shares one
queue between each two handlers.</para>

Is this correct? The example is correct, but "gives each handler a percentage of a queue" sounds to me like the other way around, where 0 means share nothing and 1 means share everything. But maybe it is just me not reading it correctly.

You could also add that the benefit of having multiple queues (e.g. one per handler) is that there is less contention when a task is added to, or selected from, a queue, which results in better performance. On the other hand, if you have two queues and one of them ends up with a task that takes a long time, one handler sits waiting to receive its next call instead of executing the pending calls in the other queue.

read.share was renamed to read.ratio (no need to document the change, since no release shipped with .share). I've also added more examples after a discussion with jon, which you should include.
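The handler-to-queue relationship described above can be sketched as follows. This is a minimal illustration of the stated behavior, not the actual HBase implementation; the function name and the rounding/clamping details are my assumptions.

```python
def num_call_queues(num_handlers, handler_factor):
    """Hypothetical sketch of handler.factor scaling handlers into queues.

    0   -> a single queue shared by all handlers
    0.5 -> one queue per two handlers
    1   -> one queue per handler
    """
    # Assumed rounding behavior; not taken from the HBase source.
    return max(1, round(num_handlers * handler_factor))

# With 30 handlers, a factor of .5 yields 15 queues (one per two handlers).
print(num_call_queues(30, 0.5))
```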
The specified value (which should be between 0.0 and 1.0) is multiplied by the number of call queues.
A value of 0 means the call queues are not split, so both read and write requests are pushed to the same set of queues.
A value lower than 0.5 means there will be fewer read queues than write queues.
A value of 0.5 means there will be the same number of read and write queues.
A value greater than 0.5 means there will be more read queues than write queues.
A value of 1.0 means that all the queues except one are used to dispatch read requests.

Example: given a total of 10 call queues:
a read.ratio of 0 means that the 10 queues contain both read and write requests.
a read.ratio of 0.3 means that 3 queues contain only read requests and 7 queues contain only write requests.
a read.ratio of 0.5 means that 5 queues contain only read requests and 5 queues contain only write requests.
a read.ratio of 0.8 means that 8 queues contain only read requests and 2 queues contain only write requests.
a read.ratio of 1 means that 9 queues contain only read requests and 1 queue contains only write requests.
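The split in the examples above can be modeled as a small sketch. This is a hypothetical reconstruction: the clamping to at least one queue per side mirrors the 1.0 example, but the exact rounding used by HBase is my assumption.

```python
def split_call_queues(num_queues, read_ratio):
    """Sketch of splitting call queues by read.ratio (assumed rounding)."""
    if read_ratio == 0:
        # No split: all queues serve both reads and writes.
        return 0, num_queues
    # Keep at least one queue on each side, as in the 1.0 example above.
    num_read = max(1, min(num_queues - 1, round(num_queues * read_ratio)))
    return num_read, num_queues - num_read

# Reproduces the examples above for 10 total call queues.
for ratio in (0, 0.3, 0.5, 0.8, 1.0):
    print(ratio, split_call_queues(10, ratio))
```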
Also, add something like: separating the read and write queues can be used to "prioritize" reads versus writes; the fewer queues an operation has, the more that operation is "throttled". Separating read and write queues also means that reads are never stuck waiting for a write operation to complete. (A simple example: with 2 handlers and 1 queue holding the sequence WRITE, WRITE, READ, the read must wait for the writes to complete; with 2 separate queues, one handler processing only the write queue and the other only the read queue, at any point in time you are executing both a read and a write.)
There is also a new scan.ratio property that splits the read call queues into short-read and long-read queues.
A value lower than 0.5 means there will be fewer long-read queues than short-read queues.
A value of 0.5 means there will be the same number of short-read and long-read queues.
A value greater than 0.5 means there will be more long-read queues than short-read queues.
A value of 0 or 1 means the same set of queues is used for gets and scans.

Example: given a total of 8 read call queues:
a scan.ratio of 0 or 1 means that the 8 queues contain both long-read and short-read requests.
a scan.ratio of 0.3 means that 2 queues contain only long-read requests and 6 queues contain only short-read requests.
a scan.ratio of 0.5 means that 4 queues contain only long-read requests and 4 queues contain only short-read requests.
a scan.ratio of 0.8 means that 6 queues contain only long-read requests and 2 queues contain only short-read requests.
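The scan.ratio examples follow the same pattern, which can be sketched the same way; again the rounding and the at-least-one-queue clamp are my assumptions, reconstructed from the examples rather than taken from the HBase source.

```python
def split_read_queues(num_read_queues, scan_ratio):
    """Sketch of splitting read queues by scan.ratio (assumed rounding)."""
    if scan_ratio in (0, 1):
        # No split: gets and scans share all read queues.
        return 0, num_read_queues
    num_long = max(1, min(num_read_queues - 1,
                          round(num_read_queues * scan_ratio)))
    return num_long, num_read_queues - num_long

# Reproduces the examples above for 8 read call queues.
for ratio in (0, 0.3, 0.5, 0.8):
    print(ratio, split_read_queues(8, ratio))
```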
And again, by dividing long reads from short reads you can "prioritize" what you need (the same idea as the read/write split, but with long and short reads).

That said, these properties are meant mainly for performance testing unless you really know what you are doing, since they are "fixed" for the RegionServer; if you want to change them you have to restart the RS. The idea is to make them dynamically configurable per user/table/namespace once we have quotas, and perhaps at some point auto-tunable based on workload stats.
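For reference, a sketch of how these properties might be set in hbase-site.xml. The values are illustrative only, and the full scan.ratio property name is my assumption based on the "hbase.ipc..." prefix discussed above; as noted, a RegionServer restart is required for changes to take effect.

```xml
<!-- Illustrative values only; restart the RegionServer after changing. -->
<property>
  <name>hbase.ipc.server.num.callqueue</name>
  <value>10</value>
</property>
<property>
  <name>hbase.ipc.server.callqueue.read.ratio</name>
  <value>0.5</value>
</property>
<!-- Assumed full name for the scan.ratio property. -->
<property>
  <name>hbase.ipc.server.callqueue.scan.ratio</name>
  <value>0.3</value>
</property>
```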

Misty Stanley-Jones
added a comment - 15/Aug/14 01:39 Thanks Matteo Bertozzi, let me know if this is better. Also, if you could make sure I'm right about how hbase.ipc.server.callqueue.handler.factor works, that would be good. I'm not quite sure about it.

Matteo Bertozzi
added a comment - 19/Aug/14 22:25
hbase.ipc.server.callqueue.read.ratio

This factor weights the queues toward reads (if below .5) or writes (if above .5).

It is the other way around; the examples are OK except one.

A value of .6 uses 75% of the queues for writing and 25% for reading. Given a value of 10 for
hbase.ipc.server.num.callqueue, 7 queues would be used for reads and 3 for writes.</para>

There is some weird math in here: 0.6 should give you 60%, not 75%. It is basically the reverse of the 0.3 example above, which is good, and it is also 60% for reading (only the first part is wrong).
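The corrected arithmetic for the .6 example, worked out with the same hypothetical rounding the earlier examples imply (my assumption, not the HBase source):

```python
# Corrected .6 example: 60% of the queues go to reads, not 75% to writes.
num_queues = 10
read_ratio = 0.6
num_read = round(num_queues * read_ratio)   # 6 queues for reads (60%)
num_write = num_queues - num_read           # 4 queues for writes
print(num_read, num_write)
```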
You can also split the read queues so that separate queues are used for short reads
(from Get operations) and short reads (from Scan operations)
Short reads are gets and long reads are scans; you have two "short"s in there.

Misty Stanley-Jones
added a comment - 20/Aug/14 01:27 OK, thanks for the clarification. I think I had a lightbulb moment and added a little more detail to explain, and I also made your corrections. Sorry about getting mixed up. By the way, the bad math was left over from my first attempt, which used 25/50/75/100 but didn't work nicely with 10 queues.