Re: Image store in HDFS and reiterative metadata from HBase?

I tried option 1 above and found that the image links in Hue do not show up as hyperlinks; even an external URL like google.com appears as plain text. Does this require any setting in Hue? See my result in the image below:

Re: Image store in HDFS and reiterative metadata from HBase?

Maybe things have changed now, but Hue was not meant for production use. I can also understand that internally you might want to use Hue to provide an interface for your business.

I like your approach number 2, but Phoenix is just a SQL tool. Are you building your own interface to run Phoenix over HBase under the hood?

Three years ago I was working on an HBase application that stored emails. The emails themselves were stored directly in HBase cells, but their attachments, which could be as much as 25 GB, were stored in HDFS. The mechanism was very similar to what you are thinking of, but instead of Phoenix we had the REST API and, obviously, a custom email interface: when a user clicked an attachment, the request went to HDFS to pull the attachment and display it.

Re: Image store in HDFS and reiterative metadata from HBase?

Sorry, I don't have code, but you definitely have the right approach and the code part should be easy. It's the scaling and architecture that you need to get right. Also, I would personally avoid using MOBs, to keep the architecture more easily scalable. You can always switch to MOBs in a future release, once you have seen a lot more successful use cases in the industry.

Whether the content itself should go in HBase or HDFS directly depends on content size. HBase now has medium object support, which means content up to a few MB is fine, particularly if you store the metadata and actual content in separate column families.

On the UI front, if you have files stored in HDFS, you can use string concatenation to embed the filename in a WebHDFS url: <a href="http://<HOST>:<PORT>/webhdfs/v1/user/dev/images/img1.gif?op=OPEN">Link</a>, which will download as a file when clicked. Note, I've done this in Zeppelin, but haven't tried it in the Hive View or in Hue.
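The string concatenation above could look roughly like this in Python. This is only a sketch: the helper names `webhdfs_open_url` and `image_link` are made up for illustration, and the host `namenode.example.com` and port `50070` are placeholders for your own NameNode HTTP address (the port in particular differs between Hadoop versions and configurations).

```python
from html import escape
from urllib.parse import quote


def webhdfs_open_url(host, port, hdfs_path):
    """Build a WebHDFS OPEN URL for an absolute HDFS path (hypothetical helper)."""
    # Percent-encode the path but keep the slashes between directories.
    encoded = quote(hdfs_path.lstrip("/"), safe="/")
    return f"http://{host}:{port}/webhdfs/v1/{encoded}?op=OPEN"


def image_link(host, port, directory, filename, label="Link"):
    """Wrap the WebHDFS URL in an HTML anchor tag (hypothetical helper)."""
    path = f"{directory.rstrip('/')}/{filename}"
    url = webhdfs_open_url(host, port, path)
    # Escape the attribute value and label so odd filenames cannot break the markup.
    return f'<a href="{escape(url, quote=True)}">{escape(label)}</a>'


# Placeholder host and port; substitute your own cluster's values.
print(image_link("namenode.example.com", 50070, "/user/dev/images", "img1.gif"))
```

In practice the filename would come from the metadata row returned by your query, and whether the resulting tag renders as a clickable link depends on the UI displaying it.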

If you're accessing content from HBase, you'll need a service to front HTTP calls. The Phoenix Query Server may make this possible out of the box, but I haven't tried.
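To give an idea of what such a fronting service might look like, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `/images/<row key>` URL scheme, the names `fetch_image`, `make_handler` and `serve`, and especially `FAKE_STORE`, an in-memory dict standing in for the real HBase lookup, which you would replace with whatever HBase client you actually use.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory stand-in for HBase: row key -> (content type, image bytes).
FAKE_STORE = {"img1": ("image/gif", b"GIF89a")}


def fetch_image(row_key):
    """Hypothetical accessor; swap in a real HBase client call here."""
    return FAKE_STORE.get(row_key)


def make_handler(fetch):
    """Create a request handler that serves GET /images/<row key>."""

    class ImageHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            prefix = "/images/"
            if not self.path.startswith(prefix):
                self.send_error(404, "Unknown path")
                return
            found = fetch(self.path[len(prefix):])
            if found is None:
                self.send_error(404, "No such image")
                return
            content_type, body = found
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return ImageHandler


def serve(port=8080):
    """Run the service until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), make_handler(fetch_image)).serve_forever()
```

Calling `serve()` starts the service, after which the UI can link to `http://127.0.0.1:8080/images/img1` the same way it would link to a WebHDFS URL. A production service would also need authentication, streaming for larger objects, and a proper application server.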