Cloud Hadoop? Buzzword Fiesta!

We haven’t quite jumped the shark yet, but this is going to be full of buzzwords.

I started a new gig where we’re building dev, POC, and possibly some prod clusters on AWS. Once again, the first 80% was pretty easy. Cloudbreak makes creating clusters straightforward, and new Ambari (gag) blueprints are simple to develop and do a lot of the heavy lifting. It starts getting ugly after that.

Cloudbreak uses Consul to “discover” hosts after creating the AWS instances, and Consul does a little black magic to provide pseudo DNS. Add “Containers” to this and you have a pretty screwed up environment from a purist point of view. Add Kerberos to the mix and you might need some Xanax. Kerberos wants nice, stable FQDNs for all of the hosts involved. That sorta goes against the idea of Elastic Hadoop, but we’ll burn that bridge later. Just getting Consul and Ambari to see each “node” (sometimes a container) under consistent names is going to be interesting.
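To make the naming problem concrete, here is a rough sketch of the kind of sanity check I'd want before turning Kerberos on: resolve each name forward, resolve the address back, and flag anything that doesn't round-trip to the same FQDN. The function name and the injectable `forward`/`reverse` resolvers are my own invention for illustration, not anything Cloudbreak, Consul, or Ambari ships, and it assumes the box you run it on uses the same resolver the cluster nodes do.

```python
import socket
from typing import Callable, Dict, Iterable


def check_fqdn_consistency(
    hostnames: Iterable[str],
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = lambda ip: socket.gethostbyaddr(ip)[0],
) -> Dict[str, str]:
    """Return {hostname: problem} for names that don't round-trip through DNS.

    `forward` maps a name to an IP and `reverse` maps an IP back to a name.
    Both default to the system resolver; they are parameters only so the
    check can be exercised without a live network (hypothetical design).
    """
    problems = {}
    for name in hostnames:
        try:
            ip = forward(name)
            back = reverse(ip)
        except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
            problems[name] = f"lookup failed: {exc}"
            continue
        if "." not in back:
            # A bare short name is not something Kerberos principals like.
            problems[name] = f"reverse lookup gave short name {back!r}"
        elif back.lower() != name.lower():
            problems[name] = f"resolves to {ip}, which maps back to {back!r}"
    return problems
```

Feed it whatever host names Ambari thinks the cluster has; an empty dict back means every name survived the round trip, and anything else is a host to fix before creating principals.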

So, Kerberized, Elastic Hadoop in the Cloud with encrypted data in flight and at rest. That’s the buzzword goal. :-/