The first iteration took 30.5 seconds as expected, but the second, and all the subsequent ones, took 100 milliseconds. That’s because a Near Cache kicked in, and all these events are close to the client: are in the client’s memory.

As expected all subsequent calls do not use the server:

Keeping Near Cache in Sync

The first logical question is: ok, I brought these events into memory, but would not they become stale in case they change on the server?

Let’s check:

;; checking on the client side(get m 41)=>41

;; checking on the client side
(get m 41)
=> 41

;; on the server: changing the value of a key 41 to 42(put! m 4142)

;; on the server: changing the value of a key 41 to 42
(put! m 41 42)

;; checking again on the client side(get m 41)=>42

;; checking again on the client side
(get m 41)
=> 42

Pretty neat. Hazelcast invalidates “nearly cached” entries by broadcasting invalidation events from the cluster members. These events are fire and forget, but Hazelcast is very good at figuring out if and when these events are lost.

There are a couple of system properties that could be configured to control this behaviour:

hazelcast.invalidation.max.tolerated.miss.count: Default value is 10. If missed invalidation count is bigger than this value, relevant cached data will be made unreachable, and the new value will be populated from the source.

hazelcast.invalidation.reconciliation.interval.seconds: Default value is 60 seconds. This is a periodic task that scans cluster members periodically to compare generated invalidation events with the received ones from Near Cache.

Near Cache Preloader

In case clients are restarted all the near caches would be lost and would need to be naturally repopulated by applications / client requests.

Near Cache can be configured with a preloader that would persist all the keys from the map to disk, and would repopulate the cache using the keys from the file in case of a restart.

As per store-initial-delay-seconds config property, 60 seconds after we created a reference to this map, preloader will persist the keys into the nearCache-events.store file (filename is configurable):

This week we learned of a recent alien hacking into Earth. We have a suspect but unsure about the true source. Could be The Borg, could be Klingons, could be Cardassians. One thing is certain: we need better security and flexibility for our space missions.

We’ll start with the first line of defense: science based education. The better we are educated, the better we are equipped to make decisions, to understand the universe, to vote.

One of the greatest space exploration frontiers is the Hubble Space Telescope. The next 5 minutes we will work on configuring and bringing the telescope online keeping things secure and flexible in a process.

The Master Plan

In order to keep things simple and solid we are going to use these tools:

As you can see the initial, default mission is the “Eagle Nebula”, Hubble’s state is stored on tape, it uses a mono (vs. color) camera and has in internal server that runs on port 4242.

Another thing to notice, Hubble stores an audit/event log in a Hazelcast cluster. This cluster needs environment specific location and creds. While the location may or may not be encrypted, the creds should definitely be.

All the above of course can be, and some of them will be, overridden at startup. We are going to keep the overrides in Consul, and the creds in Vault. On Hubble startup the Consul overrides will be merged with the Hubble internal config, and the creds will be unencrypted and security read from Vault and used to connect to the Hazelcast cluster.

Environment Matters

Before configuring Hubble, let’s create and initialize the environment. As I mentioned before we would need to setup Consul, Vault and Hazelcast.

Consul and Vault

Consul will play two roles in the setup:

a “distributed configuration” service

Vault’s secret backend

Both can be easily started with docker. We’ll use cault‘s help to setup both.

The only obvious thing to override is hubble/log/hazelcast/hosts since creds need to be later overridden securely at runtime, as well as the hubble/log/auth-token. In fact if you look into Consul, you would see neither creds nor the auth token.

The less obvious thing to override is the hubble/vault/url. We need this, so Hubble knows where Vault lives once it needs to read and decrypt creds at runtime.

We will also override hubble/log/enabled to enable Hubble event logging.

So let’ override these in Consul:

hubble/log/hazelcast/hosts to ["127.0.0.1"]

hubble/vault/url to http://127.0.0.1:8200

hubble/log/enabled to true

We can either go to the Consul UI to override these one by one, but it is easier to do it programmatically in one shot.

Envoy Extraordinary and Minister Plenipotentiary

Hubble relies on envoy to communicate with Consul, so writing a value or a map with all overrides can be done in a single go:

Secrets are best kept by people who don’t know them

Two more things to solve the puzzle are Hazelcast creds and the auth token. We know that creds are encrypted and live in Vault. In order to securely read them out we would need a token to access them. But we also do not want to expose the token to these creds, so we would ask Vault to place the creds in one of the Vault’s cubbyholes for, say 120 ms, and generate a temporary, one time use, token to access this cubbyhole. This way, once the Hubble app gets creds at runtime, this auth token did its job and can no longer be used.

cault, the one you cloned at the very beginning, has a script to generate this token. And supporting documentation on response wrapping.

We saved Hubble Hazelcast creds under secret/hubble-audit, so let’s generate this temp token for it. We need to remember the Vault’s root token from the “Vault init” step in order for cault script to work:

eda33881-5f34-cc34-806d-3e7da3906230 is the token we need, and, by default, it is going to be good for 120 ms. In order to pass it along to Hubble start, we’ll rely on cprop to merge an ENV var (could be a system property, etc.) with existing Hubble config.

In the Hubble config the token lives here:

{:hubble{:log{:auth-token"OVERRIDE ME"}}}

{:hubble {:log {:auth-token "OVERRIDE ME"}}}

So to override it we can simply export an ENV var before running the Hubble app:

Out of 472 NoSQL databases / distributed caches that are currently available, highly buzzed and scream that only their precious brand solves the world toughest problems.. There are only a few that have less screaming and more doing that I found so far:

Choosing a Drink

if I am thirsty, and I only have water available,
I'll take that, satisfy my thirst,
and come back to focusing on whatever it is I was doing

if I am thirsty, and I only have water available,
I'll take that, satisfy my thirst,
and come back to focusing on whatever it is I was doing

at the same time

if I am thirsty, and I have 472 sodas, and 253 juices, and 83 waters available
I'll take a two hour brake to choose the right one, satisfy my thirst,
and come back to focusing on whatever it is I was doing

if I am thirsty, and I have 472 sodas, and 253 juices, and 83 waters available
I'll take a two hour brake to choose the right one, satisfy my thirst,
and come back to focusing on whatever it is I was doing

It may not seem as two different experiences at first, but they are different approaches.

Especially bad is when out of those 472 sodas, #58 has such a seductive ad on a can, and it promisses you $1,000,000 prize, if only you make a sip. So you are interested ( many people drink it too ), trying it… and spitting it out immediately => now there are 471 left to go.. remaining thirst, and such a dissatisfaction

If there is CAP, there is CRACS

NoSQL is no different now days. Choosing the right data store for your task / project / problem really depends on several factors. Let’s redefine a CAP theorem (because we can), and call it CRACS instead:

Cost: Do you have spare change for several TB of RAM?

Reliability: Fault Tolerance, Availability

Amount of data: Do you operate Megabytes or Terabytes, or maybe Petabytes?

Consistency: Can you survive two reads @ the same millisecond return different results?

Speed: Reading, which includes various aggregates AND writing

That’s pretty much it, let me know if you have you favorite that is missing from CRACS, but at least for now, five is a good number. Of course there are other important things, like simplicity ( hello Cassandra: NOT ) and ease of development ( hello VoltDB, KDB: NOT ), and of course fun to work with, and of course great documentation, and of course great community, and of course… there are others, but above five seem to nail it.

Distributed Cache: Why Redis and Hazelcast?

Well, let’s see Redis and Hazelcast are distributed caches with an optional persistent store. They score because they do just that => CACHE, they are data structure based: e.g. List, Set, Queue.. even DistributedTask, DistributedEvent, etc. Again they are upfront with you: we do awesome CACHE => and they do. I have a good feeling about GemFire, but I have not tried it, and last time I contacted them, they did not respond, so that’s that.

NoSQL: Going Beyond RAM

See, what I really learned to dislike is “if it does not fit in RAM, I will quietly die for you” NoSQL data stores ( hello MongoDB ). And it is not just the index that should entirely fit into memory, but also the data that has not yet been fsync’ed, data that was brought back for querying, data that was fsync’ed, but still hangs around, etc..

The thing to understand when comparing Riak to MongoDB is that Riak actually writes to disk, and MongoDB writes to mmap files ( memory ). Try setting Mongo’s WriteConcern to “FSYNC_SAFE”, and now compare it to Riak => that would be a fair comparison. And having LevelDB as Riak’s backend, or even good old Bitcask, Riak will take Gold. Yes, Riak’s storage is pluggable : ).

Another thing besides that obvious Mongo “RAM problem” is JSON, I mean BSON. Don’t let this ‘B‘ fool you, a key name such as “name” will take at least 4 bytes, without even getting to the actual value for this key, which affects performance, as well as storage. Riak has protocol buffers as an alternative to JSON, which can really help with the document size. And with secondary indicies on the way, it may even prove be searchable : ).

Both Riak and MongoDB struggle with Map/Reduce: MongoDB does it in a single (!) SpiderMonkey thread, and of course it has a friendly GlobalLock that does not help, but it takes a point from Riak by having secondary indicies. But both: Mongo and Riak are in the process of rewriting their MapReduce frameworks completely, so we’ll see how it goes.

Cassandra Goodness and Awkwardness

Cassandra is however a solid piece of software, with one caveat: you have to hire DataStax guys, who are really, really, really good, by the way, but you have to pay up good money for such goodness. Otherwise, on your own, you don’t need to really have a PhD to handle Cassandra [ you only really need to have a PhD in Comp Sci, if you actually can and like to hang around in school for a couple more years, but that is a topic for another blog post ].

Cassandra documentation is somewhat good, and the code is there, the only problem is it looks and feels a bit Frankenstein: murmurhash from here, bloom filter from here, here is a sprinkle of Thrift ( really!? ), in case you want to Insert/Update, here is a “mutator”, etc.. Plus adding a node is really an afterthought => if you have TBs of data, adding a node can take days, no really => days. Querying is improving with CQL roll out in 0.8, but any kind of even simplistic aggregation requires some “Thriftiveness” [ read “Awkwardness” ].

CouchDB is Awesome, but JSON

CouchDB looks and feels like an awesome choice if the size of the data is not a problem. Same as with MongoDB, “JSON only” approach is a bit weird, why only JSON? The point of no schema is NOT that it changes ALL the time => then you have just a BLOB, content of which you cannot predict, hence index. The point is, the schema ( and rather some part of it ) may/will change, so WHY the heck do I have to carry those key names (that are VARCHARs and take space) around? Again CouchDB + alternative protocol would make it in my “Real NoSQL Dudes” list easily, as I like most things about it, especially secondary indicies, Ubuntu integration, mobile expansion, and of course Erlang, but Riak has Erlang too : )

VoltDB: With Great Speed Comes Great Responsibility

As far as speed, VoltDB would probably leave most of others behind (well, besides OneTick and KDB), but it actually is not NoSQL (hence it is fully ACID), and it is not NotJustInRam store, since it IS Just RAM. Three huge caveats are:

1. Everything is a precoded store procedure => ad hoc querying is quite difficult
2. Aggregation queries can’t work with data greater than 100MB ( for a temp table )
3. Data “can” be exported to Hadoop, but there is no real integration (yet) to analyze the data that is currently in RAM along with the data in Hadoop.

But it is young and worth mentioning, as I think RAM and hardware gets cheaper, “commodity nodes” become less important, so “Scale Up” solutions may actually win back some architectures. There is of course a question of Fault Tolerance, which Erlang / AKKA based systems will solve a lot better than any “Scale Up”s, but it is a topic for another time.

Dark Hourses of NoSQL

There are others, such as Tokyo Cabinet ( or is it Kyoto Cabinet now days ), Project Voldemort => I have not tried them, but have heard good stories about them. The only problem I see with these “dark horse” solutions is lack of adoption.

Neo4j: Graph it Like You Mean It

Ok, so why Neo4j? Well, because it is a graph data store, that screws with your mind a bit (data modeling) until you get it. But once you get it, there is no excuse not to use it, especially when you create your next “social network, location based, shopping smart” start up => modeling connected things as a graph just MAKES SENSE, and Neo4j is perfect for it.

You know how fast it is to find “all” connections for a single node in a non graph data store? Well, it is longer and longer with “all” being a big number. With Neo4j, it is as fast as to find a single connection => cause it is a graph, it’s a natural data structure to work with graph shaped data. It comes with a price it is not as easy to Scale Out, at least for free. But.. it is fully transactional ( even JTA ), it persists things to disk, it is baked into Spring Data. Try it, makes your brain do a couple of front splits, but your mind feels really stretched afterwards.

The Future is 600 to 1

There is of course HBase, which Hadapt guys are promising to beat 600 to 1, so we’ll see what Hadapt brings to the table. Meanwhile I invite Riak, Redis, Hazecast and Neo4j to walk together with me the slippery slope of NoSQL.