2015-01-28T23:05:00-07:00 http://dizzyd.com/blog/2015/01/28/interview1

Lately, it’s been popular to complain about how ineffective interviewing is as a
way of selecting good candidates. It’s decried as a waste of time, unpredictable
and “just wrong”.

Interviewing doesn’t have to be this way.

In every other part of software development, we invest massive amounts of
discussion, argument, blood, sweat and tears trying to find the perfect
reproducible process. We spend PEOPLE YEARS of time tuning our software lifecycles,
even when we don’t even know what we’re building or who we’re building it
for. Think about all the software methodologies we have, all the conferences,
the books, the acronyms. In my 17 years, I’ve learned/participated in at least 4 major
software development methodologies (and endless mixes): RUP, Agile, Scrum, Kanban.

You know how many interviewing methodologies I’ve been taught?

Zero.
Nil.
Nada.
Zilch.

Do you see the problem here?

Random actions yield random results. Interviewing people without a very
specific, repeatable process will yield worse than random results. Even
coin-flipping can yield a streak of good hires. An unreproducible interview
process will allow existing biases, personal mood swings and time of day (!!) to
influence the outcomes. It will create a self-reinforcing cycle of bad
hires. Not only that, it will waste precious time — quickly offsetting any
gains from adding staff.

Here’s where you start:

Determine up front how many interviews EVERY candidate will have.

Determine up front the goal of EACH interview.

If possible, determine the questions (and the specific wording of them) you will ask in each stage.

Write these decisions down. Develop a script. Train your team with this
script. Each time you choose to pass on a candidate, review the process in light
of the decision and see if you could have identified the problem earlier in the
process. Adjust as necessary, but try to change only one variable at a time.

In other words, treat hiring with the same rigor and focus on reproducibility as
with developing software. Be methodical. The truth is that the people you choose
to build software matter more than the development process you use. Good process
amplifies people, but no amount of software process can fix a bad hire.

2014-11-24T11:59:00-07:00 http://dizzyd.com/blog/2014/11/24/bias

JD Maturen recently shared a 2005 paper that
provides a deft and piercing examination of gender bias in job interviews. In
particular, the study showed that people tend to redefine the criteria for
success at a job, based not on the actual strengths/weaknesses of the applicant,
but on the gender of the applicant and the cultural gender stereotype of the
job. Study participants favored male applicants for a police chief job while
women applicants were favored for a professor of women’s studies
job. Importantly, the study was able to eliminate stereotyping as the source of
bias and instead demonstrated that study participants were actually redefining
their notion of ‘what it takes’ based on the idiosyncratic credentials of the
person they wanted to hire.

In a nutshell, the participants were redefining “merit” based on their own
biases and expectations.

This result alone should be setting off alarm bells for anyone in tech who is a
hiring manager. The study suggests that when we hire, we tend to hire those
people who fit deep-seated biases for the tech industry (i.e. male) and, even
worse, we will redefine and reinterpret the merit of a given candidate based on
these biases. In addition, male hiring managers will tend to do this measurably
more than female managers, and we already have an industry full of male
managers, so this will create a self-reinforcing feedback cycle of bias and
selection.

This is no meritocracy.

You might, at this point, be shaking your head and laying claim to the belief
that the tech industry strives for objectivity, so surely this analysis can’t
be correct. If we can just be objective and logical, as we love to believe we
are, the best applicants will rise to the top. Here is where the study
promptly takes your logical legs and sweeps them out from under you. It
demonstrates, quite conclusively, that the participants who judged themselves
most objective and free of bias, were in fact the MOST biased in their
reinterpretation of merit.

The more objective you believe yourself to be, the less objective you probably
are.

This is a devastating result. In an industry that prides itself on being a
meritocracy, the truth is that we can not even evaluate candidates for a job
without falling all over our internal biases. We are biased.

I am biased. I have told myself, time and again, that I am being
objective — I can’t help it if I only get a few candidates that are women! I
can’t help it if they are less qualified than everyone else! In retrospect,
there probably WERE qualified candidates whose merit I dismissed as I
rationalized my own biases under an illusory veil of objectivity.

We can not trust our own ability to be objective when it comes to hiring — we
are demonstrably incapable of it. Instead, we must rely on external rules to
help us ensure everyone gets a fair review. The study outlines two specific things that have been shown to reduce gender bias. Firstly, we must use a predefined structure for every interview — something I’ve personally been doing since I started hiring, albeit for different reasons. Secondly, we must clearly define the standards of merit for a position prior to the review of ANY candidates. Neither of these is a panacea for the problem, but they are relatively simple and effective ways to move in the right direction.

Want to know why there aren’t more women in tech? It’s because we don’t hire
them.

2014-04-19T21:41:00-06:00 http://dizzyd.com/blog/2014/04/19/on-relapse

On my journey through cancer, I’ve tried to always be open about what’s happening. I firmly believe in the power of open discussion to demystify and dismantle the fear of diseases like cancer. In keeping with this belief I’d like to share a little bit more about my journey and the things I’m learning.

I’m going to provide a MASSIVELY OVERSIMPLIFIED view of what I’m facing, so please don’t call the medical geeks down on me, ok?

First off, you need to understand the type of cancer that I have — it’s called follicular lymphoma. It’s a slow-moving, indolent cancer of the lymphatic system. The lymphatic system is a crucial part of your immune system and does a bunch of interesting stuff that’s not really relevant to this discussion. The important bit here is that the primary characteristic of my type of lymphoma is that the lymph nodes fill up with long-lived cells and get bulky.

Most of the cells in your body have a timer of sorts that ensures they self-destruct after a while – this mechanism is called apoptosis. This is important since it ensures that cells don’t start making bad copies of themselves after a large number of splits. Basically, it’s a method for anti-entropy at a cellular level. :) In my case, one particular type of cell (B-cell) tends to get created without this timer set. They typically hang out in the lymph nodes, so over time all the lymph nodes in my body start to get kinda bulky and impinge on other things around them. Mechanically, this can be a problem, since you wind up having large lymph nodes pressing on organs and such. But, mechanical issues are not that bad (since it can take many years for the nodes to get that large), so you might be wondering why it’s a big deal at all. Annoying, yes. Dangerous? Not for a long, long while.

The danger comes in the form of statistics. If you have enough long-lived cells hanging out, there will come a time where they start making transcription errors. With enough time and enough cells, you will get some really BAD transcription errors. Thus we get “transformation” – where the slow-moving, lazy lymphoma suddenly starts churning out highly-aggressive, consume-your-body-in-months lymphoma. Obviously, we’d like to avoid that situation.

Thus far, you might be thinking that this is a straightforward problem. Just use some chemo to clear out those bad B-cells before they ever get bulky and statistically significant. Easy, right?

Well, not exactly. Problem is that most times you don’t even know you’ve got a problem until the disease is widespread. In my case, it was stage 4 (i.e. every lymph node in my body had significant disease, including my bone marrow). Stage 4 follicular lymphoma is currently considered incurable. There are enough bad B-cells floating around that you’re not going to get them all and at some point, it will come back.

So, here we are. When I was first diagnosed, I was stage 4 and we needed to take immediate action. We used one of the biggest guns (chemo-wise) and it worked wonders. By my second (of six) treatments, I could no longer feel the lymph nodes all over my body and after three treatments they could find no evidence of disease using the PET scan. In other words, chemo worked like a champ. However, as expected, it didn’t get them all and now 4 years later, I’m slowly starting to accumulate those B-cells again. For the moment it’s just stage 1 — 3 nodes in a single area of my body. It is slowly progressing and the odds are that I will need to do another course of treatment within the next year. While it’s possible I could go longer — FL can stop progressing for a while or even regress — it’s not likely or predictable.

The challenge I now face is knowing when to do treatment. The natural progression of this disease is that each time you do a treatment you will get remission, but it will come back typically after a smaller interval each time. Thus, we want to delay treatment as long as possible, without incurring significant risk of transformation. Treatment works better when there is a lesser disease burden though, so that also has to be factored in. There is also the question of secondary risks of treatment. You typically can’t repeat the same treatment multiple times. If I re-used the treatment from my first battle, my heart would literally stop. The chemo in the second round has a non-zero chance of triggering secondary cancers such as multiple myeloma or acute leukemia. Later rounds of treatment have up to a 10% chance of mortality just from exposure to the treatment.

Adding to the complexities of my case is that I’m very young for this disease. On average, one is diagnosed with this disease in their late 50’s / early 60’s. Most of the treatment and research shoots to get you 20 years of survival after diagnosis. Being that I was diagnosed in my early 30’s, I’d like to figure out how to double that survival time. That’s non-trivial, to put it mildly.

As you can see, it’s a very, very tricky situation. I have the luxury of time to think and plan, but I’m playing a very long game — hopefully another 20 years at least! I’m in no immediate danger for the moment. There is a very small chance that those nodes which are growing could trigger a transformation, but it’s quite small. The hardest part is the mental strain. Realizing you have a set shelf life is not something you typically face at this age. Facing it the first time was incredibly difficult — I don’t know that facing it the second time has been much easier. I will say that it’s a pretty good kick in the pants to deal with those parts of your life where you are not happy or productive. It also encourages you to make the most of every moment, to be free with encouragement, sparing with criticism and most of all to pay attention to every breath.

This has turned into quite the entry. I hope it explains some of the challenges I face and clarifies my “I’m not dead yet!” quote. I have gotten so many emails and tweets of support — I am so grateful for them all. The best treatment for mental anguish is the kind and encouraging words of friends. :)

2014-04-08T21:18:00-06:00 http://dizzyd.com/blog/2014/04/08/forking-the-future

Have you ever tried to convey an earnest emotion to someone, only to have them understand you to be saying the exact opposite of what you mean?

I have. It sucks.

Communication is a funny thing. Context matters. Timing matters. Framing matters. Miss or miscalculate any of these things and the communication gets lost in its entirety. The focus shifts from what you are actually trying to communicate to the missing parts of the equation.

The curious thing about this, is that in the process of the miscommunication you can actually change the future in immutable ways. You foment feelings of mistrust, anger, hurt despite your best intentions. That damages friendships and leaves people feeling like you’re a moron (which arguably you are). There is no undoing this hurt, pain, mistrust — you’re stuck with it. The words can’t be recalled. The future has changed and now you have to live with it.

I did this tonight.

I opened my mouth and clumsily tried to put into words some things that have been weighing on my heart for a while. In the process, I shattered a friendship, one that has come to mean the world to me. I hurt this person so badly, despite my desire to share something meaningful. I screwed up the timing, I didn’t consider the framing. Pretty much every part of communication that is out there, I hosed it up.

I’m sorry I hurt you.

I don’t expect things to ever go back to how they were. I know I’ve created a fork of the future with my clumsiness and I can’t even describe the amount of pain I feel having done this. This future is lonely and feels damn bleak.

I’m sorry.

2014-04-03T00:36:00-06:00 http://dizzyd.com/blog/2014/04/03/pawpaw

Most people have some sort of name for their grandfathers — mine is “PawPaw”. He has been in my life for as long as I can remember, a big man and a big presence. He stands about 6 foot, 4 inches and has always been at least a little overweight. His real name is Bernard Newton, although most of his friends call him “Yank” — he moved from New York down to Savannah, GA when he was kid. I guess his accent was pretty foreign to the kids around him. Curiously, he met my grandmother when they were both kids and she still calls him “Yank” when she gets upset or excited.

PawPaw was 82 and recently diagnosed with stage 4 lung cancer.

For most of my life, up until college, I knew my grandfather as, well, a grandfather. He was always a bit mysterious — big, strong and loud. I knew he loved me, he’d give us great gifts at Christmas and birthdays. But, really there was a bit of distance there, he was a figure, not a person. This all changed when I wound up moving in with my grandparents after my first year of college.

If you want to get to know someone, probably the best way to do it is to live with them. It pretty quickly strips away any mystique as you see them all hours of the day, in sickness and health, etc. Getting to know PawPaw as a person, not a grandfather, was never dull. He spent 20 years in the Air Force as an NCO, spanning from World War 2 through the Vietnam War. After retiring, he then took a job with Delta Airlines and did ANOTHER 20 years of service there. By the end of his career he could literally take apart just about any flying thing from a jet engine to a helicopter and put it back together. He was brilliantly smart…and boy did he have a LOT of ribald jokes. Along with the jokes, though, he had a lot of wisdom. He taught me everything from how to negotiate for and buy a car to how to build up credit to what’s appropriate to drink when you’re late to a party and need to catch up. I’ve not used that last bit of advice very much, but the other things have been very useful.

I think, though, that the most important thing I learned from PawPaw was how a man carries himself when interacting with others. He is unfailingly polite, even when he might not like the person, he stands tall, he speaks softly but firmly. For me, he embodied this idea of strength — restrained, controlled.

PawPaw passed away this morning, quietly and comfortably.

We are all broken — spectacularly shattered humans. PawPaw was no different. I saw him argue with his wife, be petty, self-centered and prideful. But, I am a better man for having known him as a person, in all his beautiful brokenness.

There are a lot of things going on around you — but your innocence shields you from all of it. You don’t feel the spectre of Death I see upon the horizon, nor the anger and heartbreak which pounds through my mind, relentless. You don’t know the fear that haunts me, the fear that I will waste these few breaths I have, that I will not reach far enough, hard enough in these dark days.

I rock and you rest.

The prospect of Death has a funny way of reminding us why we live. We sit up and look around, suddenly awake to the fact that change is coming, whether we want it or not. There are times that this raucous alarm is an inspiration to live well, deeply. To seize the day and in so doing, grab the very fabric of Time. Other times, like today, this awakening serves only to overwhelm me and leave me once again impressed with inevitability of impermanence.

I rock and you dream.

I wish there was a way to keep you from this change and heartbreak. I would hold you static against these ravages, caught in a perfect state of bliss and innocence. And yet…if I hid you from these times, from this pain, I would also withhold from you the joy of attraction and friendship. There would be no meals with laughter, no firsts, no sighs of contentment. No pain, but also no relief.

You rock and I weep.

2014-03-21T10:57:00-06:00 http://dizzyd.com/blog/2014/03/21/emotion

Emotion surfs the waves of my mind,
Taking my breath away with each crest.
I have won and lost love
Tossed by these variant currents.
I would be an automaton — if so asked
with sturdy legs and cold heart
impervious, to soaking spray and gray surf.
2014-03-20T13:45:00-06:00 http://dizzyd.com/blog/2014/03/20/duty-and-engagement

I was chatting the other day with a good friend who is struggling with low morale after a typical turn of startup events, and they kept saying things like:

“It’s just work. I just need to get it done.”

and:

“I shouldn’t get so attached to this project.”

This line of detached thinking seems to be common when morale is low. I believe it to be a coping mechanism that allows normally passionate and creative people to rationalize a sudden loss of motivation by discounting the importance of their engagement in the problem-solving process.

The problem, of course, is that this is not just a rationalization — it’s a big fat lie. Effective problem solving requires both duty and engagement; both components are equally important. Work that is done strictly out of a sense of duty will inevitably be incomplete and error-prone. Work done simply because it looks like “fun”, but without a sense of duty, will be unbalanced and impractical.

We are well beyond the days where workers are fully replaceable cogs in a machine — knowledge workers draw upon both their intellect and emotion to generate effective solutions. As managers of these “human resources”, we must strive to ensure these dueling forces of duty and engagement remain balanced if we hope to maintain morale in uncertain times.

2010-10-16T00:00:00-06:00 http://dizzyd.com/blog/2010/10/16/10-minutes

I’ve started making an effort to spend 10 minutes reading and 10 minutes writing every day. Sometimes, the writing portion takes the form of a blog entry, sometimes a journal entry — on occasion it’s a letter to a friend. The point of it all is to practice expressing myself and expand my view on the world. Cancer, my inspiring companion, has revealed the multi-mindedness with which I pursue life to be flat and uninteresting on larger scopes.

Ten minutes doesn’t seem like much of a commitment on the surface. But in my world of 2 kids, always-connected IM and other distractions, I actually found the first week of the regime quite difficult. I’m used to hopping topics quickly and doing a lot of skimming. Writing emails and IMs is often very close to stream of consciousness and only occasionally requires a lot of focused thought. It seems like I’ve lost touch with the ability to put words to paper/screen efficiently and effectively. What’s worse — I’ve lost touch with the ability to absorb other topics, so ruthless is my drive to be expert in my field.

So, I’m trying this 10 minute regime to try and recover the skills that I’ve lost. Before I picked up programming, I loved to write; poems and short stories were great creative outlets for me. Perhaps I can recover these delights, or at least, take back the ability to communicate in the ambiguous but graceful grammar of humans.

2010-02-18T00:00:00-07:00 http://dizzyd.com/blog/2010/02/18/cancer

It’s 1.44 am. Woke up feeling weird; then my mind went running, afraid of what it might find.

I was diagnosed with follicular lymphoma three weeks ago now.

I’m blessed in a lot of ways. The cancer is slow-moving, non-aggressive — or so it appears at this point. I might not even require treatment in the near future. Even if I do require treatment, survival rates have jumped from 60% to 90% in the past five years — the treatment for this cancer is progressing quickly. My company, Basho, has been wonderful to me in terms of helping me sort out a variety of insurance issues and arranging access to very good doctors.

All of these things are probably the reason I’ve not had any trouble sleeping until tonight.

It’s still scary though. Cancer — just the word inspires fear when you first hear it. You are struck, relatively quickly, with the fragility and preciousness of life. You suddenly have a deep desire to grow old. The prospect of death is a powerful incentive to live.

I cried more the first few days and weeks than I ever have in my 32 years. I cried because I was scared. I cried because I was worried about my wife, our 2 year old and the new baby on the way. I cried because it felt unfair, unwarranted! I cried because I realized that there were some areas of my life that I had wasted — and I wondered if I would have the chance to rectify them.

As I’ve gotten further into this process, emotions have settled out a bit. I realize now just how good I have it with this cancer. What I’m facing is absolutely nothing compared to other people I know with chronic medical conditions. It’s a smudge on the screen; a minor distraction. There might be some tough times ahead, but my overall probability for immediate mortality is relatively stable and low.

That said, I’m determined to make the most of this challenge. If I must go through this valley, I’m going to extract every bit of growth from it that I can. I choose to grow, to push my boundaries in every dimension: physically, spiritually, mentally, emotionally. I choose to spend more time with my family and less time with wandering the mental spaces of coding. I choose to listen more and speak less. I choose to be grateful that all of these realizations have been granted to me at 32 instead of 64.

It’s now 2.21 am. I think it was just the Chinese food from dinner that woke me up.

2010-01-10T00:00:00-07:00 http://dizzyd.com/blog/2010/01/10/rebar

Over the past two months, I’ve been busy taking the lessons learned from erlbox and designing a pure Erlang build tool called rebar. While erlbox is a very complete toolkit of rake functions for building Erlang code, it has a couple of significant problems. First off, the external dependency on rake is often a stumbling block for developers who are not conversant in Ruby. While anyone can learn Ruby, if you’re an Erlang developer you likely have other tasks to attend to than learning a language solely for the purpose of maintaining your build system. The other significant problem with erlbox is that it spends a lot of time going in/out of Erlang to do “Erlangy” sorts of checks — like parsing/validating the .app file, running eunit, etc. This leads to erlbox being a relatively slow build system, not to mention a little awkward to maintain since it was an odd mix of Ruby and invocations of Erlang.

Thus, rebar was born. As a strictly Erlang implementation, it’s possible for Erlang developers to dig into it and improve/modify it with minimal effort. It’s also wickedly fast, since it starts the VM up only once and has direct access to all the tools one needs to build and validate Erlang code. It can also exploit Erlang’s inherent parallelism, so where possible, it runs commands concurrently. Finally, it’s designed to be a self-contained escript, so using rebar doesn’t introduce any build dependencies other than a stock Erlang install. You simply drop the rebar script into your code tree and go!

You can see a demonstration of converting an existing app to rebar here.

Create and compile a simple OTP application by doing the following steps on a terminal:
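A sketch of those steps, assuming a rebar escript already sits on your PATH (rebar 2-era command names; the generated files and template options may differ between versions):

```shell
mkdir myapp && cd myapp
# generate a skeleton OTP application: src/myapp.app.src,
# src/myapp_app.erl and src/myapp_sup.erl
rebar create-app appid=myapp
# compile src/*.erl into ebin/ and validate the .app file
rebar compile
```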

Documentation is still scarce — that’s something I’m going to be working on over the next few weeks. The core pieces of rebar are mostly at a point that I’m happy with; now it’s time to polish. :)

If you have questions about rebar, or especially feedback after using it IRL, please ping me on Freenode IRC — I’m typically in the #riak room.

2009-12-18T00:00:00-07:00 http://dizzyd.com/blog/2009/12/18/running-erl-in-a-debugger

Let’s say you need to debug a port driver in Erlang. This typically involves gdb (unless you prefer the printf route). Go to where Erlang is installed and edit the bin/erl script. Change the last line from:
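On a stock OTP install that last line execs erlexec directly; a hedged sketch of the edit (exact variable names and quoting vary by Erlang/OTP release):

```shell
# bin/erl normally ends with a line along these lines:
#
#   exec "$BINDIR/erlexec" ${1+"$@"}
#
# to start the VM under gdb with the same arguments, replace it with:
exec gdb --args "$BINDIR/erlexec" ${1+"$@"}
```

With that change, running `erl` drops you into gdb, where you can set breakpoints in the port driver before typing `run`.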

2009-11-03T00:00:00-07:00 http://dizzyd.com/blog/2009/11/03/further-thoughts-on-dynamos-flawed-architecture

Mr. Sarma revisits his claims that Dynamo is a universally “flawed architecture”. I certainly concur that Dynamo has its flaws, but making sweeping claims about something being universally so is to under-value Dynamo’s contribution to production thinking. So, once again, I’m going to take a few choice quotes from Mr. Sarma and respond to them.

However, i remain convinced that one should not force clients to deal with stale reads in
environments where they can be avoided. As i have mentioned in the updated initial post - there
are simple examples where stale reads cause havoc. One may not be able to do conflict
resolution or the reads can affect other keys in ways that are hard to fix later.

Arguing applications “may not be able to do conflict resolution” is nonsensical — by definition, Dynamo requires that the application be cognizant of conflict resolution! This isn’t an arbitrary decision to make clients aware of conflicts. It’s a part of a measured approach to building a robust system. One may not agree with it, but to claim that Dynamo is universally flawed just because it does not conform with one’s personal feeling about layering is disingenuous at best.

Please understand me, I make no claim that Dynamo is the end-all-be-all for data stores. It is a terrible, terrible choice for some problem spaces. However, if you want a low-latency, highly-robust key/value store it works quite well.

About Vector Clocks and multiple versions - it’s not a surprise that they were not
implemented in Cassandra. In Cassandra - the cost of having to retrieve many versions of a key
increases the disk seek costs reads multi-fold. Due to the usage of LSM trees, a disk seek may
be required for each file that has a version of the key. Even though the versions may not
require reconciliation, one still has to read them.

This is an argument about implementation details of Cassandra and has nothing to do with whether or not Dynamo is a universally flawed architecture. I can say from experience that vector clocks do not have to be slow — as with anything, careful implementation can yield surprisingly fast results. I would also note that in the production systems where I’ve deployed Dynamo-clones, the actual occurrence of multiple versions (or conflicts, in Dynamo terms) is quite rare. The original Dynamo paper (sect 6.3, para 3) notes that 99.94% of all requests return a single version; this matches closely with what I’ve observed in my own production deployments today (99.91%).

Also, implementation-wise, one doesn’t typically keep resolved versions lying around — the only time there are multiple versions present on disk is when a conflict has not been
resolved. One could keep old versions around, I suppose, and in that situation I agree that you would want to carefully design your store so as to avoid unnecessary seeks when reading the “current” version.

So, unfortunately, i am repeating this yet again - Dynamo’s quorum consensus
protocol seems fundamentally broken. How can one write outside the quorum group and claim a
write quorum? And when one does so - how can one get consistent reads without reading every
freaking replica all the time? (well - the answer is - one doesn’t - which is why Dynamo is
eventually consistent. I just hope that users/developers of Dynamo clones realize this now).

As Mr. Sarma astutely points out, the reason Dynamo works is because it makes no guarantees about instantaneous consistency. Assuming (again) that the client can tolerate conflicts and that the cluster will attempt to resync at the earliest possible opportunity, writing to non-authoritative nodes is perfectly fine. The system will eventually come back into consensus.

Unfortunately, I’m pretty sure that my arguments will be insufficient to convince Mr. Sarma of the utility of Dynamo. I hope, however, that anyone reading this discussion will consider that reviewing the concepts of a paper is a very different task from executing on those concepts. As someone who has successfully executed ideas from that paper, I can assure Mr. Sarma that the concepts not only work, but they work surprisingly well.

Finally, the real contribution of the Dynamo paper is the balance that was struck between performance, reliability and pragmatism in the design of a production DHT. It underscores the importance of taking nothing for granted and being willing to consider counter-intuitive solutions to hard problems.

2009-11-01T00:00:00-06:00 http://dizzyd.com/blog/2009/11/01/thoughts-on-dynamos-flawed-architecture

In general, I think it’s a little inflammatory to make sweeping statements about the fitness of a given architecture. Every architecture has its flaws; it’s an expected state when you are faced with diametrically opposing constraints. The real question that should be asked is whether or not an architecture solves the problems for which it was designed in a reliable and efficient manner.

Joydeep Sarma posted an entry claiming that Dynamo is a “flawed architecture”. I’m not really qualified to prove or disprove Mr. Sarma’s claim, but having implemented a Dynamo clone, I think that he may be a little confused about how things work in these systems. What follows are a few quotes from his write-up followed by my own responses.

Let’s say that one is storing key-value pairs in Dynamo - where the value encodes a ‘list’. If
Dynamo returns a stale read for a key and claims the key is missing, the application will
create a new empty list and store it back in Dynamo. This will cause the existing key to be
wiped out. Depending on how ’stale’ the read was - the data loss (due to truncation of the
list) can be catastrophic. This is clearly unacceptable. No application can accept unbounded
data loss - not even in the case of a Disaster.

Dynamo implementations protect against this scenario by using vector clocks. If we define a “stale read” as one which returns the key (or absence thereof) and an older vector clock, then any writes which use this older/non-existent vector clock will generate a conflict and the server will store two versions of the same key. The application then has the opportunity to resolve this conflict on the next read. When used in conjunction with quorums for reads and writes, this approach proves to be exceedingly robust.
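To make that mechanism concrete, here is a minimal vector-clock sketch in Python — a toy model with names of my own choosing (`descends`, `resolve_write`), not Dynamo’s or any clone’s actual API:

```python
def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def resolve_write(stored, incoming):
    """Decide what the server keeps when a write arrives.

    stored/incoming are (clock, value) pairs; a clock maps node -> counter.
    """
    s_clock, _ = stored
    i_clock, _ = incoming
    if descends(i_clock, s_clock):
        return [incoming]          # incoming supersedes: normal update
    if descends(s_clock, i_clock):
        return [stored]            # write based on a stale read: rejected
    return [stored, incoming]      # concurrent: keep both versions (siblings)

# A client that did a stale read writes back with an outdated clock:
current = ({"A": 2}, ["x", "y"])          # latest version on the server
stale_write = ({"A": 1, "B": 1}, ["z"])   # descended from the old {"A": 1}
siblings = resolve_write(current, stale_write)
assert len(siblings) == 2  # conflict detected; client resolves on next read
```

Nothing is silently wiped out: the “empty list” write from Mr. Sarma’s example would surface as a second sibling rather than truncating the existing value.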

Dynamo starts by saying it’s eventually consistent - but then in Section 4.5. it claims
a quorum consensus scheme for ensuring some degree of consistency. It is hinted that by setting
the number of reads (R) and number of writes (W) to be more than the total number of replicas
(N) (ie. R+W>N) - one gets consistent data back on reads. This is flat out misleading. On close
analysis one observes that there are no barriers to joining a quorum group (for a set of
keys). Nodes may fail, miss out on many many updates and then rejoin the cluster - but are
admitted back to the quorum group without any resynchronization barrier. As a result, reading
from R copies is not sufficient to give up-to-date data.

One of the foundational assumptions in the Dynamo system is that you define as many replicas as necessary to achieve your desired level of reliability. As with any replication-based system, if you lose all of your replicas, there is no meaningful recovery. However, if we assume that you will always have some number of replicas functional, and we introduce an appropriate quorum on operations, we can identify those nodes which return stale data and repair them appropriately. In other words, it’s perfectly possible not to have a resync barrier on rejoin, yet still ensure consistency in the answers provided to the client.
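The quorum arithmetic behind this is easy to verify by brute force. The following toy sketch (illustrative only; module and function names are hypothetical) checks that with N=3 replicas and R=W=2, every possible read set intersects every possible write set, so a read always consults at least one replica that saw the latest write:

```erlang
%% Brute-force check that R+W > N forces read/write set overlap.
-module(quorum_check).
-export([overlaps/0]).

%% All subsets of size K drawn from list L.
subsets(0, _) -> [[]];
subsets(_, []) -> [];
subsets(K, [H | T]) ->
    [[H | S] || S <- subsets(K - 1, T)] ++ subsets(K, T).

%% With N=3 nodes and R=W=2, every read quorum shares at least
%% one node with every write quorum (R -- W removes shared
%% elements, so inequality proves an intersection exists).
overlaps() ->
    Quorums = subsets(2, [a, b, c]),
    lists:all(fun({R, W}) -> (R -- W) =/= R end,
              [{R, W} || R <- Quorums, W <- Quorums]).
```

overlaps() returns true, which is exactly the property the quorum scheme relies on: there is no pair of read and write quorums that miss each other entirely.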

It might be helpful to recall that there are three levels of repair: read repair, hinted handoff and replica synchronization. Two of these three are done in near-real time, thus minimizing the actual drift between nodes. Read repair deals with stale data on a per key/operation basis; the coordinator for a request can identify nodes responding with stale data and update them accordingly, using responses from other less stale nodes. Hinted handoff is a bulk operation that is done when a node rejoins the cluster — the keys updated while the node was down are replayed (in essence) to the rejoining node. Replica sync is something that is typically done once a day and does require a traversal of all the data for a given partition. Tricks like Merkle trees, however, permit only the changed portion of the data to be exchanged, so in practice it’s not nearly as expensive as one might imagine in the abstract.
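The Merkle-tree trick can be sketched as follows. This is a toy illustration (not the hash-tree code of any real implementation): leaf hashes cover the keys in a partition, interior hashes combine their children, and two replicas compare from the root down, recursing only into subtrees whose hashes differ:

```erlang
%% Toy Merkle-root construction: an interior hash is the SHA-1 of
%% the concatenated child hashes. Replicas with equal roots can
%% skip synchronization entirely; differing roots are narrowed by
%% recursing into the subtrees whose hashes disagree, so sync
%% traffic tracks the changed data rather than the whole dataset.
-module(merkle_sketch).
-export([root/1]).

root([H]) -> H;
root(Hashes) -> root(level(Hashes)).

level([A, B | Rest]) ->
    [crypto:hash(sha, <<A/binary, B/binary>>) | level(Rest)];
level([Odd]) -> [Odd];
level([]) -> [].
```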

Lack of point in time consistency at the surviving replica (that is evident in this scenario)
is very problematic for most applications. In cases where one transaction (B) populates entites
that refer to entities populated in previous transactions (A), the effect of B being applied to
the remote replica without A being applied leads to inconsistencies that applications are
typically ill equipped to handle (and doing so would make most applications complicated).

The Dynamo paper makes it very clear that applications do require more logic to deal with these situations. Yes, it’s more work for the application, but in practice, it’s not that bad. It’s also important to point out that data dependencies are handled differently in these key/value stores than they are in a typical ACID environment. Usually apps will store the data in a denormalized form, so dependencies amongst key versions are minimal (if they exist at all). This makes it much easier to deal with conflicts as all the relevant data is on hand during the resolution phase.

I’ll leave it to someone else to do a more exhaustive analysis of Mr. Sarma’s arguments. It’s been my experience over the past 2 years that Dynamo is one of those systems that you really have to see in action (or implement it) to appreciate the wonderful elegance and resiliency of the design. It’s certainly not a one-size-fits-all solution, but works very well in the appropriate problem space.

Getting Started with erlbox (2009-08-19)
http://dizzyd.com/blog/2009/08/19/getting-started-with-erlbox

erlbox is a set of Rake tasks that make it easy to build Erlang applications and embedded nodes. It’s a framework that Phil and I developed over the past few months and is now something I’d prefer not to live without. While it would be nice to have a “pure” Erlang solution for doing builds, Rake has turned out to be an excellent tool and a reasonable, pragmatic solution to the problem.

Please note that erlbox (and this blog entry) isn’t necessarily where you want to start when you’re first learning Erlang — see any of the excellent books for a good introduction to Erlang.

To get started with erlbox, you first need to install Ruby and RubyGems. Once you have RubyGems installed, you can then do:

$ gem install erlbox

This should pull down erlbox as well as Rake and any other dependencies. Note that the Debian version of RubyGems is a little weird — my experience is that using RubyGems from source on Debian yields the best results.

Once you have erlbox installed, you’re ready to put together an Erlang application. In keeping with OTP guidelines, we’ll start by creating a standard OTP directory structure:

$ mkdir -p testapp/ebin testapp/src

The next step is to create an application descriptor (ebin/testapp.app) so that Erlang/OTP knows how to start our application up. Drop the following text into testapp/ebin/testapp.app:
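A minimal descriptor for this example might look like the following (the description and version strings here are placeholder assumptions; the modules list names testapp, which matters in the next step):

```erlang
%% testapp/ebin/testapp.app — OTP application descriptor.
%% The modules list declares that ebin/testapp.beam will exist,
%% which erlbox validates at build time.
{application, testapp,
 [{description, "A test application"},
  {vsn, "0.1.0"},
  {modules, [testapp]},
  {registered, []},
  {applications, [kernel, stdlib]}]}.
```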

Running the build at this point shows that we still have some work to do. One of the important things that erlbox does is validate the application descriptor ebin/testapp.app and ensure that all the modules it lists are present in compiled form in the ebin/ directory. In this case, the .app file claimed that a module named “testapp” would be present as ebin/testapp.beam, and erlbox generated an error when the module was not found.

So, let’s create the source for the testapp module. Drop the following text into testapp/src/testapp.erl:
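A minimal module that satisfies the descriptor could look like this (a sketch — any exported function will do; the point is that the module compiles to ebin/testapp.beam):

```erlang
%% testapp/src/testapp.erl — trivial module so the build can
%% produce the ebin/testapp.beam that testapp.app promises.
-module(testapp).
-export([start/0]).

start() ->
    ok.
```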

This time the build completes cleanly. Congratulations, you’ve just built a basic Erlang/OTP application using erlbox.

There are a lot more features in erlbox beyond what I’ve covered here. It also has the ability to build/compile OTP embedded nodes, SNMP MIBs, port drivers and other everyday components that make up the OTP platform.

Drained (2008-12-20)
http://dizzyd.com/blog/2008/12/20/drained

The last portion of this year has been draining on many fronts. It’s not a complaint — just a statement. I was grading two classes, taking another and working full time. It was too much. I worked through everything, but at different times had to neglect things that I would have preferred to focus on.

I am slowly recovering. Now in this slow, happy time of Christmas and New Year’s I find myself without my normal drive. I feel empty and light; it’s disturbing after the harried pace of the past 4 months. There are so many side projects that I want to work on, but simply can’t find the energy or desire to focus on them.

Over the years I’ve come to realize that the creative expenditure of creating software comes at a price. I have a fixed capacity for creating software — if I expend that capacity it requires time to refuel. In the interim, I can still create but at a much diminished pace, and typically with a much lower quality than what I am accustomed to. The best thing to do, typically, is NOT create. Wait, pause and be patient. Permit focus to drift until it’s ready to snap back to laser precision for the next Push.

This post probably sounds like nonsense — perhaps it is.

GTD and Clarity (2008-05-13)
http://dizzyd.com/blog/2008/05/13/gtd-and-clarity

I’ve recently been bitten by the GTD (http://en.wikipedia.org/wiki/Getting_Things_Done) bug. I’m not exactly a disorganized person — I generally do get stuff done. What attracted me to the system is the core idea of striving for clarity of thought by eliminating (brain) clutter.

I’ve always loved the feeling that I get when I lose myself to a particularly challenging or fun piece of coding. It’s that state of mind where you lose track of the passage of time and focus all your energies on turning ephemeral ideas into billions of electronic pulses. There is a clarity of thought in that state, and I would love to experience it more often.

The problem is, there is always clutter and noise. So, the logical question is, how does one eliminate these things and encourage a more constant state of clarity?

For myself, I’ve found that GTD is at least a starting point. It provides a framework on which to capture actions and ideas in a way that shunts the responsibility for tracking stuff from my brain to a more reliable store. As I’ve been consistently doing this for the past week, my list of actions/projects has grown far more rapidly than I would have ever thought. The amount of stuff that we juggle in our heads is truly prodigious — no wonder the average attention span in our society is under 3 minutes.

2008-03-14
http://dizzyd.com/blog/2008/03/14/118

Digits click,
Neuron to circuit, ideas flow;
Software breathes.
Morning (2008-01-22)
http://dizzyd.com/blog/2008/01/22/morning

Earl grey, hot.
Morning brew,
Happy soul.
An Emacs Mini-Hack (2007-10-14)
http://dizzyd.com/blog/2007/10/14/an-emacs-mini-hack

There have been a whole host of changes in my life since my last blog post. I left Ping (http://www.pingidentity.com) back in August and am now working at The Hive (http://thehive.com). My wife and I also welcomed our first child into the world a few weeks ago. :)

At any rate, I’m now using emacs on a regular basis for editing C/C++ code and got tired of switching buffers manually between header (.h/.hpp) and implementation (.c/.cpp) files. So I hacked a little lisp into my .emacs to make life better. Maybe someone else will find this useful too.
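The hack can be sketched along these lines — a reconstruction rather than the original snippet, built on Emacs’ built-in ff-find-other-file command (the C-c o binding is an arbitrary choice):

```elisp
;; Jump between a header (.h/.hpp) and its implementation
;; (.c/.cpp) with a single keystroke. ff-find-other-file ships
;; with Emacs and knows the standard C/C++ extension pairs.
(require 'find-file)
(global-set-key (kbd "C-c o") 'ff-find-other-file)
```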