hbase-issues mailing list archives

[jira] [Commented] (HBASE-17471) Region Seqid will be out of order in WAL if using mvccPreAssign

Date: Wed, 25 Jan 2017 05:21:27 GMT

[ https://issues.apache.org/jira/browse/HBASE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837210#comment-15837210
]
Allan Yang commented on HBASE-17471:
------------------------------------
Sorry, my replies may be delayed because the Chinese Spring Festival vacation is coming.
But I'm still working on this issue.
1.
{quote}
what Yu Li said regards tests passing though there are hanging mvcc transactions; they are
covering up dodgy behavior. This suggestion of yours will help?
{quote}
Yes, it would help, as I said in this [comment|https://issues.apache.org/jira/browse/HBASE-17471?focusedCommentId=15835553&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15835553].
But there is a different opinion. [~Apache9] said {quote}If the problem only happens in UTs
then let's just modify the UTs...{quote}. So I'm not sure whether to add this code, at the cost of making
the code 'messy', or to open another issue to fix those problematic UTs, and HBASE-17506 as well.
[~stack], please give me your advice.
2.
As for [~tedyu]'s suggestions, I will include them in the next patch and upload it soon.
3.
{quote}
Stamping sequenceid into Cells could be done in the constructor rather than in stampRegionSequenceId.
nit: You want to do more cleanup here?
{quote}
Yes, I want to do it in a new issue, not this one, since I have a sense that we actually
don't need to stamp mvcc/seqid to cells in the WAL edits. We only need to stamp them to cells
in the memstore. I want to open a new issue to discuss this later. So, let's keep it as it is in this patch.
4.
{quote}
Let me run ITBLL with chaos on this little cluster for a while with the patch in place. That'll
test some that all is working as expected. I'll be back.
{quote}
Yes, please verify this patch with ITBLL, and thank you very much, [~stack]!
5.
{quote}
And it seems still some efforts to take for branch-1 patch. IMO it's necessary to check the
perf data for branch-1 since the write path there is different from 2.0 (append wal ->
write memstore -> sync wal v.s. append wal -> sync wal -> write memstore). Thanks.
{quote}
We tested this patch in our custom HBase-1.1.2, and there is no regression there either. The order
of steps in {{doMiniBatchMutation}} will not influence the mvcc assignment. Still, if I have
time, I will post data on branch-1. [~carp84]
6.
{quote}
Are you going to attach patch for branch-1 ?
{quote}
Sure, but there may be some delay, sorry for that.
> Region Seqid will be out of order in WAL if using mvccPreAssign
> ---------------------------------------------------------------
>
> Key: HBASE-17471
> URL: https://issues.apache.org/jira/browse/HBASE-17471
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Critical
> Attachments: HBASE-17471-duo.patch, HBASE-17471-duo-v1.patch, HBASE-17471-duo-v2.patch,
HBASE-17471.patch, HBASE-17471.tmp, HBASE-17471.v2.patch, HBASE-17471.v3.patch, HBASE-17471.v4.patch,
HBASE-17471.v5.patch
>
>
> mvccPreAssign was introduced by HBASE-16698, which truly improved write performance,
especially in the ASYNC_WAL scenario. But mvccPreAssign is only used in {{doMiniBatchMutate}},
not in the Increment/Append path. If Increment/Append and batch put run against the same
region in parallel, then the seqids of that region may not be monotonically increasing in the
WAL, since one write path acquires mvcc/seqid before append, and the other acquires it in the
append/sync consumer thread.
> The out-of-order situation can easily be reproduced by a simple UT, which is attached to
this issue. I modified the code to assert on the disorder:
> {code}
> if(this.highestSequenceIds.containsKey(encodedRegionName)) {
> assert highestSequenceIds.get(encodedRegionName) < sequenceid;
> }
> {code}
> I'd say that if we allow disorder in WALs, then this is not an issue.
> But as far as I know, if {{highestSequenceIds}} is not properly set, some WALs may not
be archived to oldWALs correctly.
> What I haven't figured out yet is whether disorder in the WAL will cause data loss when recovering
from disaster. If so, then it is a big problem that needs to be fixed.
> I have fixed this problem in our custom 1.1.x branch. My solution is to use mvccPreAssign
everywhere, making it un-configurable, since mvccPreAssign is indeed a better way than
assigning the seqid in the ringbuffer thread while keeping handlers waiting for it.
> If anyone thinks it is doable, then I will port it to branch-1 and the master branch and upload
it.
>
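The interleaving described in the issue can be illustrated with a small, single-threaded sketch. It is not HBase code: the class {{SeqIdOrderSketch}}, the {{Entry}} type, the queue standing in for the ring buffer, and the shared counter are all hypothetical names chosen for illustration. It replays one fixed ordering in which a batch put takes its seqid before appending, while an Increment appends first and receives its seqid only in the consumer, and then applies the same kind of check as the assertion above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class SeqIdOrderSketch {
    static final long UNASSIGNED = -1L;

    /** Hypothetical stand-in for a WAL entry of a single region. */
    static final class Entry {
        long seqId;
        Entry(long seqId) { this.seqId = seqId; }
    }

    /** Replays one fixed interleaving; returns seqids in the order they reach the WAL. */
    static List<Long> mixedPathsWalOrder() {
        long nextSeqId = 1;                          // shared mvcc/seqid counter (assumption)
        Queue<Entry> ringBuffer = new ArrayDeque<>();

        // Batch put (pre-assign path): acquires its seqid BEFORE appending.
        Entry put = new Entry(nextSeqId++);
        // Increment (other path): appends first, seqid is left to the consumer.
        ringBuffer.add(new Entry(UNASSIGNED));
        // The batch put reaches the ring buffer only now.
        ringBuffer.add(put);

        // Consumer thread: assigns missing seqids in arrival order, then writes.
        List<Long> walOrder = new ArrayList<>();
        while (!ringBuffer.isEmpty()) {
            Entry e = ringBuffer.poll();
            if (e.seqId == UNASSIGNED) {
                e.seqId = nextSeqId++;
            }
            walOrder.add(e.seqId);
        }
        return walOrder;
    }

    /** Index of the first seqid that is not greater than the highest seen so far, or -1. */
    static int firstDisorder(List<Long> walOrder) {
        long highest = Long.MIN_VALUE;
        for (int i = 0; i < walOrder.size(); i++) {
            long seq = walOrder.get(i);
            if (seq <= highest) {
                return i;
            }
            highest = seq;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Long> order = mixedPathsWalOrder();
        System.out.println("WAL order: " + order
            + ", first disorder at index " + firstDisorder(order));
    }
}
```

In this replay the Increment is written with seqid 2 before the batch put with seqid 1, so the check reports a disorder at the second entry. A real reproduction depends on thread timing; the sketch only fixes one possible ordering to make the mechanism visible.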