Monday, November 9, 2015

Query an HBase table from Hive

It is old news that you can query HBase from Hive. There are steps
to do that listed here, here and here. What was news to me was that you
can now query HBase and expose the row timestamp in Hive. This feature is
included in the 1.1.0 release of Apache Hive as part of this Jira. There’s more work to be done
in exposing a cell’s timestamp, and you can watch the progress of that in the following ticket,
but for the sake of this post, I’d like to demo the first option.

Assume you have a table in HBase called jsontable, with a string rowKey
and a column cf:json (column family cf, qualifier json). The following gist shows the Hive create statement.
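
As a rough sketch of what such a create statement can look like, assuming the `:timestamp` entry in `hbase.columns.mapping` that Hive 1.1.0 introduced: the Hive table name `hbase_jsontable` and the column names `key`, `json` and `ts` are my own placeholders, not names taken from the gist.

```sql
-- Sketch only: hbase_jsontable, key, json and ts are hypothetical names.
-- EXTERNAL because the HBase table jsontable is assumed to exist already.
CREATE EXTERNAL TABLE hbase_jsontable (
  key  STRING,     -- HBase row key
  json STRING,     -- value stored in column cf:json
  ts   TIMESTAMP   -- row timestamp, exposed through the :timestamp mapping
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- one mapping entry per Hive column, in the same order as above
  "hbase.columns.mapping" = ":key,cf:json,:timestamp"
)
TBLPROPERTIES ("hbase.table.name" = "jsontable");
```

The mapping entries line up positionally with the Hive columns, so the third column receives the row timestamp.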

Once you create the table in Hive, you can query it and
view the row timestamp alongside the data.
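
For example, assuming a Hive table named `hbase_jsontable` whose columns `key`, `json` and `ts` map to `:key`, `cf:json` and `:timestamp` (all hypothetical names chosen for illustration), a query along these lines should return the timestamp with each row:

```sql
-- Sketch only: table and column names are assumptions, not from the original gist.
SELECT key, json, ts
FROM hbase_jsontable
LIMIT 10;
```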