hbase-issues mailing list archives

[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

Date: Thu, 13 Aug 2015 20:28:46 GMT

[ https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695884#comment-14695884
]
Jerry He commented on HBASE-11368:
----------------------------------
bq. but that should be guaranteed with the seqId / mvcc combination and not via region write
lock.
How? Couldn't the following happen if the bulk load did not take the region write lock?
a. The bulk load obtains a seqId.
b. A read request comes in and gets that seqId as its mvcc read point.
c. The read is able to see the partially loaded data while the bulk load is still in progress.
In the 0.98 code line, we don't have seqId, and atomicity is still guaranteed there.
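To make steps a to c concrete, here is a minimal sketch of the race. It is a toy model, not HBase code: the class `PartialBulkLoadVisibility`, its `load` and `read` methods, and the family names `cf1` and `cf2` are all hypothetical, and the visibility rule (a reader sees every cell whose seqId is at or below its read point) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a region with several column families.
// It is NOT HBase's implementation; it only illustrates the ordering problem.
class PartialBulkLoadVisibility {

    // A cell tagged with the sequence id it was written under.
    static final class Cell {
        final long seqId;
        final String value;

        Cell(long seqId, String value) {
            this.seqId = seqId;
            this.value = value;
        }
    }

    private final Map<String, List<Cell>> families = new HashMap<>();

    // Stand-in for loading one family's file under the bulk load's seqId.
    void load(String family, long seqId, String value) {
        families.computeIfAbsent(family, k -> new ArrayList<>()).add(new Cell(seqId, value));
    }

    // Assumed visibility rule: a reader sees cells with seqId <= readPoint.
    List<String> read(String family, long readPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell c : families.getOrDefault(family, Collections.emptyList())) {
            if (c.seqId <= readPoint) {
                visible.add(c.value);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        PartialBulkLoadVisibility region = new PartialBulkLoadVisibility();

        long bulkSeqId = 5L;                       // a. the bulk load obtains a seqId
        region.load("cf1", bulkSeqId, "row1/cf1"); //    first family is loaded
        long readPoint = bulkSeqId;                // b. a read takes that seqId as its read point

        // c. with no lock excluding the reader, it sees cf1 but not yet cf2
        System.out.println("cf1 visible: " + region.read("cf1", readPoint));
        System.out.println("cf2 visible: " + region.read("cf2", readPoint));

        region.load("cf2", bulkSeqId, "row1/cf2"); //    second family arrives too late
    }
}
```

In this model the reader observes the new data in `cf1` and nothing in `cf2`, which is the partially loaded state described in step c.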
bq. On bulk load, we call HStore.notifyChangedReadersObservers(), which resets the KVHeap,
but we never reset the RegionScanner from my reading of code. Is this a bug?
I think it is being propagated properly to the scanner. Consider that the same notifyChangedReadersObservers
is used at the end of compactions and flushes as well. The reset of the readers should
work.
I think the region write lock is still the only guarantee for bulk load atomicity. At a
high level, the region scan and next calls run within the region read lock, which is mutually
exclusive with the bulk load process, which needs the region write lock. This is heavy.
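The locking relationship can be sketched as follows, assuming the region lock behaves like a `java.util.concurrent.locks.ReentrantReadWriteLock`. The class `RegionLockSketch`, the method `tryBulkLoad`, and the 100 ms wait are hypothetical names and values chosen for illustration, not the actual HBase code path.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: readers (scans, compactions) share the read lock,
// while a bulk load needs the write lock and gives up after a bounded wait.
class RegionLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Stand-in for a bulk load. Returns false if the write lock could not be
    // acquired within waitMillis because a reader still holds the read lock.
    boolean tryBulkLoad(long waitMillis) throws InterruptedException {
        if (!lock.writeLock().tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            // All column families would be loaded here, with no reader active.
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns { result while a reader holds the lock, result after it releases }.
    static boolean[] demo() throws InterruptedException {
        RegionLockSketch region = new RegionLockSketch();
        CountDownLatch readerHoldsLock = new CountDownLatch(1);
        CountDownLatch readerMayFinish = new CountDownLatch(1);

        // Stand-in for a long-running scan or compaction under the read lock.
        Thread reader = new Thread(() -> {
            region.lock.readLock().lock();
            try {
                readerHoldsLock.countDown();
                readerMayFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                region.lock.readLock().unlock();
            }
        });
        reader.start();
        readerHoldsLock.await();

        boolean whileReading = region.tryBulkLoad(100); // times out
        readerMayFinish.countDown();
        reader.join();
        boolean afterReading = region.tryBulkLoad(100); // succeeds

        return new boolean[] { whileReading, afterReading };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] results = demo();
        System.out.println("bulk load while reader active: " + results[0]);
        System.out.println("bulk load after reader done:   " + results[1]);
    }
}
```

The first attempt returns false and the second returns true. This is also why the approach is heavy: any reader that holds the read lock longer than the bulk load is willing to wait causes the load to fail.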
> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>
> Key: HBASE-11368
> URL: https://issues.apache.org/jira/browse/HBASE-11368
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Qiang Tian
> Attachments: hbase-11368-0.98.5.patch, hbase11368-master.patch, key_stacktrace_hbase10882.TXT,
performance_improvement_verification_98.5.patch
>
>
> Compactions take a read lock. For a multi-column-family region, we want to take a write
lock on the region before bulk loading. If the compaction takes too long, the bulk load
fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in HBASE-10882
as a workaround.
> Does the compaction need a read lock for that long? Does the bulk load need a full write
lock when there are multiple column families? Can we fail more gracefully at least?