Advogato blog for kr
http://www.advogato.org/person/kr/
Sun, 2 Aug 2015 22:32:01 GMT
Tue, 22 Jan 2013 10:09:01 GMT
Inbox Zero for Life
http://www.advogato.org/person/kr/diary.html?start=22
http://xph.us/2013/01/22/inbox-zero-for-life.html<p>“<a href="https://twitter.com/daneharrigan/status/281153123505037312" >.@krarick’s email system works. He should blog about it</a>”<br/>— <a href="https://twitter.com/daneharrigan" >@daneharrigan</a></p>
<h2 id="do_this">Do This</h2>
<ol><li>
<p>Use the Gmail web UI with shortcut keys.</p>
</li>
<li>
<p><strong>Mute.</strong> Heavily. The <code>m</code> shortcut key is your friend.</p>
</li>
<li>
<p><strong>Triage.</strong> Your one and only goal for processing your inbox is to <em>make it empty</em>. Not to actually do anything productive, because processing email is inherently anti-productive. Don’t fool yourself into thinking you’re doing work here. Just get it over with as quickly as possible. For each message in your inbox, quickly choose one of these things:</p>
<ul><li>Mute the conversation. (Press <code>m</code>.)</li>
<li>Read (or skim) it and archive. (Press <code>e</code>.)</li>
<li>If it requires action and it’ll take you less than 30 seconds, <a href="http://www.doitfuckingnow.com/" >DIFN</a> and archive the message. (Press <code>e</code>.)</li>
<li>If it requires action and it would take you 30 seconds or more, star and archive it <em>immediately</em>. (Press <code>s</code> then <code>e</code>.)</li>
<li>It is permitted to begin replying to a message and then realize it’ll take longer than you thought. In that case you should defer the rest of your reply. (Press <code>se</code>.)</li>
</ul><p>If it takes you more than, say, three seconds to decide which thing to do, star and archive. (Press <code>se</code>.)</p>
<p>You can get rid of a hundred threads this way in five minutes. Now your inbox is empty. You’re welcome. You still have a to-do list, and it is not empty.</p>
</li>
<li>
<p><strong>Check starred items.</strong> Press <code>g</code> then <code>s</code>. Open this folder regularly. <code>gs</code>. It’s your new home. Learn to love it. <code>gs</code>. If there were a way to make this folder the one you see first when you log in, I would tell you to do that.</p>
</li>
<li>
<p>Process your inbox again! <code>gi</code>. Later, at some time in the future, when you feel emotionally centered. Maybe after you’ve completed something from your starred items and you’re feeling good about yourself. <code>gi</code>. Certainly not while you’re in the middle of doing something useful. But don’t just “check” your inbox, <em>process</em> it. If you’ve gone to the trouble of opening it up, you might as well rip through those messages (there are only like two of them now since it was empty before) and get it empty again.</p>
</li>
<li>
<p>Survey the universe. <code>gs</code> <code>gi</code>. It’s easy to bounce between these two folders. <code>gs</code> <code>gi</code>. You know, get an overview of what’s going on. Expect to have 0–4 items in each of these folders most of the time. Tending toward the 0 side. Not too bad, eh?</p>
</li>
<li>
<p><strong>Pick one thing from your starred items and do that thing.</strong></p>
</li>
</ol><p>When you’ve finished, unstar it. (Press <code>u</code> to go back to the starred items list and <code>s</code> to unstar it. The gesture to remember is <code>us</code>.)</p>
<p>That’s it, really. Looks like a lot of rules, now that I’ve typed them all out, but it’s not so bad. Just remember, <em>triage</em>.</p>
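<p>For the programmers in the audience, the triage rules above fit in one small function. This is only a sketch in Python: the <code>Message</code> fields are made-up estimates you would supply in your head, not anything Gmail exposes, and the thresholds are the 30-second and three-second rules from the list.</p>

```python
from dataclasses import dataclass


@dataclass
class Message:
    # All fields are hypothetical gut estimates, not Gmail data.
    wanted: bool                # False: you never want to see this thread again
    needs_action: bool          # does it ask you to do something?
    seconds_to_handle: int      # rough guess at how long the action takes
    seconds_to_decide: int = 0  # how long you have been dithering over it


def triage(msg: Message) -> str:
    """Return the Gmail shortcut keys to press for one inbox message."""
    if msg.seconds_to_decide > 3:
        return "se"  # cannot decide quickly: star and archive
    if not msg.wanted:
        return "m"   # mute the conversation
    if not msg.needs_action:
        return "e"   # read or skim, then archive
    if msg.seconds_to_handle < 30:
        return "e"   # do it now, then archive
    return "se"      # star for later and archive immediately
```

<p>Every branch ends with the message leaving the inbox, which is the whole point.</p>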
<h2 id="do_not">Do Not</h2>
<ol><li><em>Do not mark as unread.</em> To-do items are hereby banned from your inbox. Also, you don’t need that <strong>this-item-is-unread-and-the-font-is-bold</strong> visual treatment to distinguish to-do items from other mail. A star is sufficient. Your to-do list, the starred items folder, is such a tranquil place now. Namaste. (Exception: you’re looking at a message and legitimately don’t have time to read it because you must abort processing mail. Mark it as unread and leave it in the inbox with all the other messages you actually haven’t read.)</li>
<li><em>Do not use priority inbox.</em> Turn it off; you don’t need it any more. Classic inbox all the way. Those little yellow “important” markers? Turn those off too. They’re nothing but a distraction.</li>
<li><em>Do not use push notifications</em> for incoming mail on your phone. You really don’t want to be interrupted, distracted by incoming mail while you’re doing actual work. <em>You decide</em> when to process your inbox.</li>
</ol><h2 id="frequently_anticipated_questions">Frequently Anticipated Questions</h2>
<p>
<em>Keith, why are you blogging about methods for email productivity?</em>
</p>
<p>I’m not a fan of heavy process. GTD makes me squirm. I’m more of a <a href="http://www.doitfuckingnow.com/" >DIFN</a> kind of guy. But email is a problem. I get a lot of it. Some of you probably get more. Ignoring the problem didn’t work. Simply <a href="http://www.doitfuckingnow.com/" >D-ing IFN</a> isn’t good enough when I have more than one thing to do. And the instinctive, thoughtless mark-as-read-and-leave-in-the-inbox-as-my-todo-list had become a quagmire. I had an ever-growing inbox and an ever-sinking feeling in the pit of my stomach. And one day I snapped, and decided to do something about it. One word appeared in my head, unbidden, fully-formed, as if from the head of Zeus: <em>triage</em>. Everything here is an outgrowth of that seed.</p>
<p>
<em>Isn’t it bad to use email as a to-do list?</em>
</p>
<p>We all do it. Stop feeling guilty about that. Embrace it.</p>
<p>
<em>Doesn’t this just move your to-do list to another place? How does it actually help?</em>
</p>
<p>Out of sight, out of mind. Just seeing all the crap you have to do increases cognitive burden. Seeing more crap you may or may not have to do but which is simply mixed in with the crap you actually have to do pushes cognitive burden over the edge. That’s the key: your inbox mixes to-do items with legitimately unread mail, while starred items gives you a much quieter, tractable to-do list.</p>
<p>Another useful, subtle property of keeping to-do items out of your inbox: it’s totally obvious when you’re done triaging. You simply run out of mail. There’s no “oh I think I got to all the new mail here, everything that’s left is stuff I’ve seen before”. No need to <em>discern</em>.</p>
<p>
<em>How can I avoid falling into the trap of spending too much time on certain inbox messages?</em>
</p>
<p>Remember, triage. Getting to inbox zero means outpacing the inflow of mail, and that means quick decision-making. You’re a battlefield medic deciding which limbs to save and which to amputate; time is not on your side.</p>
<p>
<em>What if I already use stars for something else?</em>
</p>
<p>Then maybe this system isn’t for you. Low friction is crucial to making this work. Anything that takes more than one or two keystrokes to use is too hard to use consistently. Yes, you could use a “todo” label, even from the keyboard. But typing <code>gltodo⏎</code> (or even <code>glt⏎</code>) to view it is too slow. <a href="http://www.slideshare.net/pravinnirmal/latency-kills-by-shishir-birmiwal-presentation" >Latency kills</a>. Star is <code>s</code>, one keystroke.</p>
<p>Luckily for me, I never used to use stars for anything. I had one or two messages starred randomly for no good reason. Those are gone now. We’re going to make good, heavy use of stars here. Let no starry emotional baggage get in the way; we’re completely commandeering your stars.</p>
<p>
<em>Why can’t I just mark-as-unread in my inbox?</em>
</p>
<p>Actual to-do items get lost in the noise. Mark-as-unread makes it take more effort than necessary to reason whether a given email is something you’ve looked at and decided to put off or something you actually haven’t seen before. This makes the task of processing email <em>itself</em> a nice candidate for putting off, since it takes so long.</p>
<p>Take back the “unread” state. Mark as unread only those messages you really haven’t read. Now it means what it says.</p>
<p>
<em>Should I use a multi-inbox feature to see starred items and inbox items side-by-side in one view?</em>
</p>
<p>No. It’s a good thing that the starred items are on a separate screen from the inbox. The combined view of both at once is an anti-pattern. You’re either processing inbox mail or doing work, not both. Focus.</p>
<p>
<em>How does my smartphone fit into this?</em>
</p>
<p>I don’t know about you, but I hate typing on my phone. Hate. It. Short text messages are ok, but typing an email doesn’t work. “<a href="http://joey.hess.usesthis.com/" >I feel that my thoughts are being forced out through a straw.</a>” So I never do real work on my phone or even send replies. But <em>triage</em>, on the other hand… triage is easy and fast. Recent incarnations of the Gmail iOS app are quite decent, and, crucially, contain a mute button (sadly lacking in the otherwise-excellent <a href="http://www.sparrowmailapp.com/" >Sparrow</a>). Phone triage is an effective way to stay on top of your inbox, especially when your coworkers are as maddeningly productive as <a href="https://jobs.heroku.com/" >mine</a>.</p>
<p>
<em>“Stay on top of” my inbox? Really? That sounds awful.</em>
</p>
<p>Believe me, it’s easy, but let me clarify: <em>do not worry about staying</em> on top of your inbox. Just process mail whenever you want, every 5 minutes, every 5 hours, whatever. Don’t worry about unread email piling up. 5, 10, 20, more unread messages? <em>No sweat.</em> Processing them is easy. You can (and should) do it on your terms.</p>
<p>
<em>How do I compose a new message without being distracted by my inbox and feeling compelled to process it?</em>
</p>
<p>A tip from <a href="https://twitter.com/rwdaigle" >@rwdaigle</a>:</p>
<blockquote>
<p>I’ve been keeping my default view on Gmail the drafts (or some other easily accessible but usually empty folder) so I can switch to Gmail to do basic composing/searching without being hit in the face with new messages. Works pretty well once you get in the habit of switching to the empty folder as an “I’m done for now” kind of gesture at the end of your processing sessions.</p>
</blockquote>
<p>The shortcut sequence to switch to the drafts folder is <code>gd</code>.</p>
<p>Also remember: it’s okay to leave unread email in your inbox. You always decide when to process it.</p>
<p>
<em>But System X already does this! Nothing new here.</em>
</p>
<p>That’s not a question, but I’ll answer it anyway. First, the other systems I’ve seen certainly share some characteristics and use some of the same psychological tricks, but they’ve all felt really complicated to me. I don’t want to have to learn a hundred rules about how to organize my note cards into projects and tasks with colors and categories and… yuck. So this system is as simple as I could make it. Most of the rules are about getting rid of stuff. (Though, looking over that list above, I still wish I could make it shorter.) And it doesn’t require any special tools, just the Gmail web UI. Second, I’m sure there are lots of productivity systems out there I’ve never heard of, so maybe there’re already some systems out there just as good as this one. Good for them.</p>
<h2 id="conclusion">Conclusion</h2>
<p>So, does all this work? I didn’t know when I started out, but figured it was worth a shot at least. My biggest worry was that I’d just move the problem to another place: instead of an ever-growing inbox I’d have an ever-growing starred items folder. But a curious thing happened. My heavy inbox burden shrank a bit after the first processing session. And it just kept shrinking. My new starred items burden feels qualitatively different. Staring at a bold to-do-list inbox makes it all too easy to throw up your hands and close the browser. After all, even if you were to pick one of those emails and take care of it, it’d probably be replaced with 3 more, like a hydra, by the time you returned to your inbox. The ground is constantly shifting under you. On the other hand, your starred items folder only ever contains things that <em>you put there</em>. You can look at it, pick one thing to do, do that thing, and look again: there is now one fewer thing! You and you alone decide when to process incoming mail and star new messages. This might seem like an academic distinction, but let me tell you, it feels worlds apart, and I actually manage to knock off my starred items at a regular pace. They don’t pile up.</p>
<p>Now, getting to inbox zero is no longer some Herculean achievement, no longer worthy of <a href="https://twitter.com/search?q=%23inboxzero" >announcing to the world</a>. After a few weeks doing this stuff I didn’t just get there, I’ve <em>stayed</em> at inbox zero virtually the entire time. And it feels pretty easy to maintain that way. I expect no inbox creep in my future. Inbox zero for life!</p>
Fri, 22 Apr 2011 05:16:53 GMT
Introducing Doozer
http://www.advogato.org/person/kr/diary.html?start=21
http://xph.us/2011/04/13/introducing-doozer.html<p>We need a consistent, highly-available data store.<sup id='fnref:1'><a href='#fn:1' rel='footnote'>1</a></sup></p>
<p>There is no shortage of tools that provide one of these properties or the other. For example, mysql, postgres, and redis provide consistency but not high availability; riak, couchdb, and redis cluster (when it&#8217;s ready) provide high availability but not consistency. The only freely-available tool that attempts to provide both of these is zookeeper, but zookeeper is specialized, focused on locking and server management. We need a general-purpose tool that provides the same guarantees in a clean, well-designed package.</p>
<p>That&#8217;s why I&#8217;m excited to present <a href='https://github.com/ha/doozerd'>doozer</a>, a consistent, highly-available data store.</p>
<h2 id='background'>Background</h2>
<p>Soon after I started working at Heroku, Blake Mizerany got me involved in designing a &#8220;distributed init&#8221; system, something that could manage processes across multiple machine instances and recover gracefully from single instance failures and network partitions. One necessary building block for this is a network service that can provide synchronization for its clients &#8211; in other words, locking.</p>
<p>When we started building this distributed process monitor, we quickly realized that the lock service should be packaged as a separate tool. In true Unix style, it strives to do one thing well. In fact, doozer doesn&#8217;t actually provide locks directly. Its &#8220;one thing&#8221; is consistent, highly-available storage; locks are a separate thing, that clients can implement on top of doozer&#8217;s primitives. This makes doozer both simpler and more powerful, as we&#8217;ll see in a moment.</p>
<h2 id='locks_are_not_primitive_enough'>Locks Are Not Primitive Enough</h2>
<p>Other similar systems (for example Chubby<s>, Zookeeper</s><sup id='fnref:2'><a href='#fn:2' rel='footnote'>2</a></sup>) provide locks as a primitive operation, alongside data storage. But that requires the lock service to decide when the holder of a lock has died, in order to release the lock. In practice this means that every client that wants to obtain a lock must establish a session and send periodic heartbeat messages to the server. This works well enough for some clients, but it&#8217;s not always the most appropriate way to determine liveness. It&#8217;s particularly troublesome when dealing with off-the-shelf software that wasn&#8217;t designed to send heartbeat messages.</p>
<p>Doozer takes a different approach. It provides only data storage and a single, fundamental synchronization primitive, <em>compare-and-set</em>. This operation is complete (you can build any other synchronization operation using it), but it&#8217;s simpler than a higher-level lock, and it doesn&#8217;t require the server to have any notion of liveness of its clients.</p>
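<p>To make that concrete, here is a rough Python sketch of a lock built from nothing but compare-and-set. The <code>Store</code> class is a toy in-memory stand-in written for this illustration, not doozer or any of its drivers; the method names, the revision scheme, and the path are all assumptions. The local mutex merely plays the role that the consensus protocol plays in a real deployment.</p>

```python
import threading


class Store:
    """Toy in-memory model of a consistent store (hypothetical, not doozer's API)."""

    def __init__(self):
        self._mu = threading.Lock()  # stands in for the consensus protocol
        self._data = {}              # path -> (revision, value)
        self._rev = 0                # store-wide revision counter

    def get(self, path):
        """Return (revision, value); a path never written is (0, None)."""
        with self._mu:
            return self._data.get(path, (0, None))

    def compare_and_set(self, path, old_rev, value):
        """Write value only if path is still at old_rev.

        Return the new revision on success, or None if someone else wrote first.
        """
        with self._mu:
            cur_rev, _ = self._data.get(path, (0, None))
            if cur_rev != old_rev:
                return None
            self._rev += 1
            self._data[path] = (self._rev, value)
            return self._rev


def try_lock(store, path, owner):
    """Try to take the lock at path; return a revision token, or None if held."""
    rev, holder = store.get(path)
    if holder is not None:
        return None
    return store.compare_and_set(path, rev, owner)


def unlock(store, path, token):
    """Release the lock, but only if it is still the one we acquired."""
    return store.compare_and_set(path, token, None) is not None


store = Store()
token = try_lock(store, "/lock/job", "worker-a")    # succeeds
blocked = try_lock(store, "/lock/job", "worker-b")  # None: already held
released = unlock(store, "/lock/job", token)        # True
```

<p>Note what is missing: nothing in the store decides that worker-a has died. Whether and when to break a stale lock is left to the clients, which is exactly the liveness question the server no longer has to answer.</p>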
<p>In the future, doozer might ship with companion tools that provide higher-level synchronization for users that want it, but these tools will operate as regular doozer clients, and they&#8217;ll be completely optional. If you don&#8217;t need their services, just don&#8217;t run them; if you need something slightly different from what they provide, you are free to make your own.</p>
<h2 id='what_is_it_good_for'>What is it good for?</h2>
<p>How does doozer compare to other data stores out there? Redis is amazingly fast, with lots of interesting data structures to play with. HBase is amazingly scalable. Doozer isn&#8217;t particularly fast or scalable; its claim to fame is high availability.</p>
<p>Doozer is where you put the family jewels.</p>
<p>The <a href='https://github.com/ha/doozerd'>doozer readme</a> has a few concrete examples, but consider this one: imagine you have three redis servers &#8211; one master and two slaves. If the master goes down, wouldn&#8217;t it be nice to promote one of the slaves to be the new master? Imagine trying to automate that promotion. You&#8217;ll have to get all the clients to agree which one to use. It doesn&#8217;t sound easy, and it&#8217;s even harder than it sounds. If there is a network partition, some of the clients might disagree about which slave to promote, and you&#8217;d wind up with the classic &#8220;split-brain&#8221; problem. If you want this promotion to work reliably, you have to use a tool like doozer, which guarantees consistency, to coordinate the fail-over.</p>
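<p>The fail-over can be sketched in a few lines of Python. As before, this is an illustration under stated assumptions rather than real client code: <code>Store</code> is a single-process toy model of a consistent store, and the path <code>/redis/master</code> and the server names are invented. Each client that notices the dead master proposes its favorite slave with compare-and-set against the revision it last saw, then reads back whatever the store says.</p>

```python
class Store:
    """Single-process toy model of a consistent store (hypothetical API)."""

    def __init__(self):
        self.rev = 0
        self.data = {}  # path -> (revision, value)

    def get(self, path):
        return self.data.get(path, (0, None))

    def compare_and_set(self, path, old_rev, value):
        """Write only if path is unchanged since old_rev; return new rev or None."""
        if self.get(path)[0] != old_rev:
            return None
        self.rev += 1
        self.data[path] = (self.rev, value)
        return self.rev


MASTER = "/redis/master"  # hypothetical path naming the current master


def promote(store, seen_rev, candidate):
    """Propose candidate as the new master and return the agreed-upon master.

    If another client already won the race, the write is rejected and we
    simply adopt the winner instead of our own candidate.
    """
    store.compare_and_set(MASTER, seen_rev, candidate)
    return store.get(MASTER)[1]


store = Store()
seen = store.compare_and_set(MASTER, 0, "redis-a")  # initial master

# redis-a dies. Two clients notice and disagree about the replacement.
choice_1 = promote(store, seen, "redis-b")
choice_2 = promote(store, seen, "redis-c")
```

<p>Both clients end up pointing at the same server, because only the first write against the old revision can succeed. A client cut off from the store by a partition cannot complete the write at all, so it cannot crown a second master on its own.</p>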
<h2 id='using_doozer'>Using Doozer</h2>
<p>We have client drivers for Go (<a href='https://github.com/ha/doozer'>doozer</a>) and Ruby (<a href='https://github.com/ha/fraggle'>fraggle</a>), with an Erlang driver in the works. The <a href='https://github.com/ha/doozerd/blob/master/doc/proto.md'>doozer protocol documentation</a> gives the nitty-gritty of talking to doozer, but most of you will be interested in the interface provided by the driver you&#8217;re using; they each have documentation.</p>
<div class='footnotes'><hr /><ol><li id='fn:1'>
<p>The words &#8220;consistency&#8221; and &#8220;high availability&#8221; have several reasonable definitions, and people too often use them without saying exactly what they mean. When I speak of consistency, I mean absolute prevention of inconsistent writes. The phrase &#8220;eventual consistency&#8221; is a synonym for &#8220;inconsistency&#8221;. When I speak of high availability, I mean primarily the ability to face a network partition and continue providing write service to all clients transitively connected to a majority of servers. Secondarily, I mean the ability to continue providing read-only service to all clients connected to any server. <strong>Update:</strong> This is not the same as being &#8220;available&#8221; in the sense of the frequently-cited and almost-as-frequently-misunderstood CAP theorem. Coda Hale has a good discussion of the <a href='http://codahale.com/you-cant-sacrifice-partition-tolerance/'>CAP theorem&#8217;s applicability</a> to real-world systems. In those terms, we choose consistency over availability.</p>
<a href='#fnref:1' rev='footnote'>&#8617;</a></li><li id='fn:2'><span><b>Updated:</b> Zookeeper treats locks similarly to doozer.</span><a href='#fnref:2' rev='footnote'>&#8617;</a></li></ol></div>
Wed, 13 Apr 2011 23:10:15 GMT
Introducing Doozer
http://www.advogato.org/person/kr/diary.html?start=20
http://xph.us/2011/04/13/introducing-doozer.html<p>We need a consistent, highly-available data store.<sup id='fnref:1'><a href='#fn:1' rel='footnote'>1</a></sup></p>
<p>There is no shortage of tools that provide one of these properties or the other. For example, mysql, postgres, and redis provide consistency but not high availability; riak, couchdb, and redis cluster (when it&#8217;s ready) provide high availability but not consistency. The only freely-available tool that attempts to provide both of these is zookeeper, but zookeeper is specialized, focused on locking and server management. We need a general-purpose tool that provides the same guarantees in a clean, well-designed package.</p>
<p>That&#8217;s why I&#8217;m excited to present <a href='https://github.com/ha/doozer'>doozer</a>, a consistent, highly-available data store.</p>
<h2 id='background'>Background</h2>
<p>Soon after I started working at Heroku, Blake Mizerany got me involved in designing a &#8220;distributed init&#8221; system, something that could manage processes across multiple machine instances and recover gracefully from single instance failures and network partitions. One necessary building block for this is a network service that can provide synchronization for its clients &#8211; in other words, locking.</p>
<p>When we started building this distributed process monitor, we quickly realized that the lock service should be packaged as a separate tool. In true Unix style, it strives to do one thing well. In fact, doozer doesn&#8217;t actually provide locks directly. Its &#8220;one thing&#8221; is consistent, highly-available storage; locks are a separate thing, that clients can implement on top of doozer&#8217;s primitives. This makes doozer both simpler and more powerful, as we&#8217;ll see in a moment.</p>
<h2 id='locks_are_not_primitive_enough'>Locks Are Not Primitive Enough</h2>
<p>Other similar systems (Chubby, Zookeeper) provide locks as a primitive operation, alongside data storage. But that requires the lock service to decide when the holder of a lock has died, in order to release the lock. In practice this means that every client that wants to obtain a lock must establish a session and send periodic heartbeat messages to the server. This works well enough for some clients, but it&#8217;s not always the most appropriate way to determine liveness. It&#8217;s particularly troublesome when dealing with off-the-shelf software that wasn&#8217;t designed to send heartbeat messages.</p>
<p>Doozer takes a different approach. It provides only data storage and a single, fundamental synchronization primitive, <em>compare-and-set</em>. This operation is complete (you can build any other synchronization operation using it), but it&#8217;s simpler than a higher-level lock, and it doesn&#8217;t require the server to have any notion of liveness of its clients.</p>
<p>In the future, doozer might ship with companion tools that provide higher-level synchronization for users that want it, but these tools will operate as regular doozer clients, and they&#8217;ll be completely optional. If you don&#8217;t need their services, just don&#8217;t run them; if you need something slightly different from what they provide, you are free to make your own.</p>
<h2 id='what_is_it_good_for'>What is it good for?</h2>
<p>How does doozer compare to other data stores out there? Redis is amazingly fast, with lots of interesting data structures to play with. HBase is amazingly scalable. Doozer isn&#8217;t particularly fast or scalable; its claim to fame is high availability.</p>
<p>Doozer is where you put the family jewels.</p>
<p>The <a href='https://github.com/ha/doozer'>doozer readme</a> has a few concrete examples, but consider this one: imagine you have three redis servers &#8211; one master and two slaves. If the master goes down, wouldn&#8217;t it be nice to promote one of the slaves to be the new master? Imagine trying to automate that promotion. You&#8217;ll have to get all the clients to agree which one to use. It doesn&#8217;t sound easy, and it&#8217;s even harder than it sounds. If there is a network partition, some of the clients might disagree about which slave to promote, and you&#8217;d wind up with the classic &#8220;split-brain&#8221; problem. If you want this promotion to work reliably, you have to use a tool like doozer, which guarantees consistency, to coordinate the fail-over.</p>
<h2 id='using_doozer'>Using Doozer</h2>
<p>We have client drivers for Go (<a href='https://github.com/ha/doozer/tree/master/src/pkg/client'>doozer/client</a>) and Ruby (<a href='https://github.com/ha/fraggle'>fraggle</a>), with an Erlang driver in the works. The <a href='https://github.com/ha/doozer/blob/master/doc/proto.md'>doozer protocol documentation</a> gives the nitty-gritty of talking to doozer, but most of you will be interested in the interface provided by the driver you&#8217;re using; they each have documentation.</p>
<div class='footnotes'><hr /><ol><li id='fn:1'>
<p>The words &#8220;consistency&#8221; and &#8220;high availability&#8221; have several reasonable definitions, and people too often use them without saying exactly what they mean. When I speak of consistency, I mean absolute prevention of inconsistent writes. The phrase &#8220;eventual consistency&#8221; is a synonym for &#8220;inconsistency&#8221;. When I speak of high availability, I mean primarily the ability to face a network partition and continue providing write service to all clients transitively connected to a majority of servers. Secondarily, I mean the ability to continue providing read-only service to all clients connected to any server. <strong>Update:</strong> This is not the same as being &#8220;available&#8221; in the sense of the frequently-cited and almost-as-frequently-misunderstood CAP theorem. Coda Hale has a good discussion of the <a href='http://codahale.com/you-cant-sacrifice-partition-tolerance/'>CAP theorem&#8217;s applicability</a> to real-world systems. In those terms, we choose consistency over availability.</p>
<a href='#fnref:1' rev='footnote'>&#8617;</a></li></ol></div>
Wed, 13 Apr 2011 22:12:34 GMT
Introducing Doozer
http://www.advogato.org/person/kr/diary.html?start=19
http://xph.us/2011/04/13/introducing-doozer.html<p>We need a consistent, highly-available data store.<sup id='fnref:1'><a href='#fn:1' rel='footnote'>1</a></sup></p>
<p>There is no shortage of tools that provide one of these properties or the other. For example, mysql, postgres, and redis provide consistency but not high availability; riak, couchdb, and redis cluster (when it&#8217;s ready) provide high availability but not consistency. The only freely-available tool that attempts to provide both of these is zookeeper, but zookeeper is specialized, focused on locking and server management. We need a general-purpose tool that provides the same guarantees in a clean, well-designed package.</p>
<p>That&#8217;s why I&#8217;m excited to present <a href='https://github.com/ha/doozer'>doozer</a>, a consistent, highly-available data store.</p>
<h2 id='background'>Background</h2>
<p>Soon after I started working at Heroku, Blake Mizerany got me involved in designing a &#8220;distributed init&#8221; system, something that could manage processes across multiple machine instances and recover gracefully from single instance failures and network partitions. One necessary building block for this is a network service that can provide synchronization for its clients &#8211; in other words, locking.</p>
<p>When we started building this distributed process monitor, we quickly realized that the lock service should be packaged as a separate tool. In true Unix style, it strives to do one thing well. In fact, doozer doesn&#8217;t actually provide locks directly. Its &#8220;one thing&#8221; is consistent, highly-available storage; locks are a separate thing, that clients can implement on top of doozer&#8217;s primitives. This makes doozer both simpler and more powerful, as we&#8217;ll see in a moment.</p>
<h2 id='locks_are_not_primitive_enough'>Locks Are Not Primitive Enough</h2>
<p>Other similar systems (Chubby, Zookeeper) provide locks as a primitive operation, alongside data storage. But that requires the lock service to decide when the holder of a lock has died, in order to release the lock. In practice this means that every client that wants to obtain a lock must establish a session and send periodic heartbeat messages to the server. This works well enough for some clients, but it&#8217;s not always the most appropriate way to determine liveness. It&#8217;s particularly troublesome when dealing with off-the-shelf software that wasn&#8217;t designed to send heartbeat messages.</p>
<p>Doozer takes a different approach. It provides only data storage and a single, fundamental synchronization primitive, <em>compare-and-set</em>. This operation is complete (you can build any other synchronization operation using it), but it&#8217;s simpler than a higher-level lock, and it doesn&#8217;t require the server to have any notion of liveness of its clients.</p>
<p>In the future, doozer might ship with companion tools that provide higher-level synchronization for users that want it, but these tools will operate as regular doozer clients, and they&#8217;ll be completely optional. If you don&#8217;t need their services, just don&#8217;t run them; if you need something slightly different from what they provide, you are free to make your own.</p>
<h2 id='what_is_it_good_for'>What is it good for?</h2>
<p>How does doozer compare to other data stores out there? Redis is amazingly fast, with lots of interesting data structures to play with. HBase is amazingly scalable. Doozer isn&#8217;t particularly fast or scalable; its claim to fame is high availability.</p>
<p>Doozer is where you put the family jewels.</p>
<p>The <a href='https://github.com/ha/doozer'>doozer readme</a> has a few concrete examples, but consider this one: imagine you have three redis servers &#8211; one master and two slaves. If the master goes down, wouldn&#8217;t it be nice to promote one of the slaves to be the new master? Imagine trying to automate that promotion. You&#8217;ll have to get all the clients to agree which one to use. It doesn&#8217;t sound easy, and it&#8217;s even harder than it sounds. If there is a network partition, some of the clients might disagree about which slave to promote, and you&#8217;d wind up with the classic &#8220;split-brain&#8221; problem. If you want this promotion to work reliably, you have to use a tool like doozer, which guarantees consistency, to coordinate the fail-over.</p>
<h2 id='using_doozer'>Using Doozer</h2>
<p>We have client drivers for Go (<a href='https://github.com/ha/doozer/tree/master/src/pkg/client'>doozer/client</a>) and Ruby (<a href='https://github.com/ha/fraggle'>fraggle</a>), with an Erlang driver in the works. The <a href='https://github.com/ha/doozer/blob/master/doc/proto.md'>doozer protocol documentation</a> gives the nitty-gritty of talking to doozer, but most of you will be interested in the interface provided by the driver you&#8217;re using; they each have documentation.</p>
<div class='footnotes'><hr /><ol><li id='fn:1'>
<p>The words &#8220;consistency&#8221; and &#8220;high availability&#8221; have several reasonable definitions, and people too often use them without saying exactly what they mean. When I speak of consistency, I mean absolute prevention of inconsistent writes. The phrase &#8220;eventual consistency&#8221; is a synonym for &#8220;inconsistency&#8221;. When I speak of high availability, I mean primarily the ability to face a network partition and continue providing write service to all clients transitively connected to a majority of servers. Secondarily, I mean the ability to continue providing read-only service to all clients connected to any server. <strong>Update:</strong> This is not the same as being &#8220;available&#8221; in the sense of the frequently-cited and almost-as-frequently-misunderstood CAP theorem. Coda Hale has a good discussion of the <a href='http://codahale.com/you-cant-sacrifice-partition-tolerance/'>CAP theorem&#8217;s applicability</a> to real-world systems. In those terms, we choose consistency over availability.</p>
<a href='#fnref:1' rev='footnote'>&#8617;</a></li></ol></div>
Wed, 13 Apr 2011 08:11:12 GMT
Introducing Doozer
http://www.advogato.org/person/kr/diary.html?start=18
http://xph.us/2011/04/13/introducing-doozer.html<p>We need a consistent, highly-available data store.<sup id='fnref:1'><a href='#fn:1' rel='footnote'>1</a></sup></p>
<p>There is no shortage of tools that provide one of these properties or the other. For example, mysql, postgres, and redis provide consistency but not high availability; riak, couchdb, and redis cluster (when it&#8217;s ready) provide high availability but not consistency. The only freely-available tool that attempts to provide both of these is zookeeper, but zookeeper is specialized, focused on locking and server management. We need a general-purpose tool that provides the same guarantees in a clean, well-designed package.</p>
<p>That&#8217;s why I&#8217;m excited to present <a href="https://github.com/ha/doozer">doozer</a>, a consistent, highly-available data store.</p>
<h2 id='background'>Background</h2>
<p>Soon after I started working at Heroku, Blake Mizerany got me involved in designing a &#8220;distributed init&#8221; system, something that could manage processes across multiple machine instances and recover gracefully from single instance failures and network partitions. One necessary building block for this is a network service that can provide synchronization for its clients &#8211; in other words, locking.</p>
<p>When we started building this distributed process monitor, we quickly realized that the lock service should be packaged as a separate tool. In true Unix style, it strives to do one thing well. In fact, doozer doesn&#8217;t actually provide locks directly. Its &#8220;one thing&#8221; is consistent, highly-available storage; locks are a separate thing that clients can implement on top of doozer&#8217;s primitives. This makes doozer both simpler and more powerful, as we&#8217;ll see in a moment.</p>
<h2 id='locks_are_not_primitive_enough'>Locks Are Not Primitive Enough</h2>
<p>Other similar systems (Chubby, Zookeeper) provide locks as a primitive operation, alongside data storage. But that requires the lock service to decide when the holder of a lock has died, in order to release the lock. In practice this means that every client that wants to obtain a lock must establish a session and send periodic heartbeat messages to the server. This works well enough for some clients, but it&#8217;s not always the most appropriate way to determine liveness. It&#8217;s particularly troublesome when dealing with off-the-shelf software that wasn&#8217;t designed to send heartbeat messages.</p>
<p>Doozer takes a different approach. It provides only data storage and a single, fundamental synchronization primitive, <em>compare-and-set</em>. This operation is complete (you can build any other synchronization operation using it), but it&#8217;s simpler than a higher-level lock, and it doesn&#8217;t require the server to have any notion of liveness of its clients.</p>
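<p>To make the idea concrete, here is a minimal sketch of a lock built on compare-and-set. It is written in Python against a toy in-process store; the <code>Store</code> class, its method names, and the revision scheme are assumptions made for illustration, not doozer&#8217;s actual interface. Deciding when a dead holder&#8217;s lock should be released is deliberately left out, since that policy belongs to the clients.</p>
<pre><code class='language-python'>class Store:
    """Toy in-process stand-in for a consistent store (not doozer's real API)."""

    def __init__(self):
        self._data = {}  # path: (value, rev)
        self._rev = 0    # store-wide revision counter

    def get(self, path):
        # A path that was never written reads as (None, 0).
        return self._data.get(path, (None, 0))

    def compare_and_set(self, path, value, old_rev):
        # Write only if the caller saw the latest revision of this path.
        _, current_rev = self.get(path)
        if current_rev != old_rev:
            return None  # someone else wrote first; the caller must re-read
        self._rev += 1
        self._data[path] = (value, self._rev)
        return self._rev


def try_lock(store, path, owner):
    """Take the lock if it is free; never blocks."""
    value, rev = store.get(path)
    if value is not None:
        return False  # already held
    return store.compare_and_set(path, owner, rev) is not None


def unlock(store, path, owner):
    """Release the lock, but only if `owner` currently holds it."""
    value, rev = store.get(path)
    if value != owner:
        return False
    return store.compare_and_set(path, None, rev) is not None


store = Store()
print(try_lock(store, "/lock/job", "worker-a"))  # True
print(try_lock(store, "/lock/job", "worker-b"))  # False
print(unlock(store, "/lock/job", "worker-a"))    # True
print(try_lock(store, "/lock/job", "worker-b"))  # True
</code></pre>
<p>The store itself knows nothing about locks or about which clients are alive; it only refuses writes that are based on a stale revision.</p>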
<p>In the future, doozer might ship with companion tools that provide higher-level synchronization for users that want it, but these tools will operate as regular doozer clients, and they&#8217;ll be completely optional. If you don&#8217;t need their services, just don&#8217;t run them; if you need something slightly different from what they provide, you are free to make your own.</p>
<h2 id='what_is_it_good_for'>What is it good for?</h2>
<p>How does doozer compare to other data stores out there? Redis is amazingly fast, with lots of interesting data structures to play with. HBase is amazingly scalable. Doozer isn&#8217;t particularly fast or scalable; its claim to fame is high availability.</p>
<p>Doozer is where you put the family jewels.</p>
<p>The <a href="https://github.com/ha/doozer">doozer readme</a> has a few concrete examples, but consider this one: imagine you have three redis servers &#8211; one master and two slaves. If the master goes down, wouldn&#8217;t it be nice to promote one of the slaves to be the new master? Imagine trying to automate that promotion. You&#8217;ll have to get all the clients to agree which one to use. It doesn&#8217;t sound easy, and it&#8217;s even harder than it sounds. If there is a network partition, some of the clients might disagree about which slave to promote, and you&#8217;d wind up with the classic &#8220;split-brain&#8221; problem. If you want this promotion to work reliably, you have to use a tool like doozer, which guarantees consistency, to coordinate the fail-over.</p>
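<p>A rough sketch of that fail-over, in Python: two clients notice the dead master and each tries to promote a different slave, but only the first compare-and-set succeeds, so both end up using the same server. The <code>Store</code> class, the <code>/redis/master</code> path, and the server names are hypothetical. The toy store lives in one process, whereas the real guarantee comes from doozer keeping its servers consistent, which this sketch does not attempt to show.</p>
<pre><code class='language-python'>class Store:
    """Toy in-process stand-in for a consistent store (not doozer's real API)."""

    def __init__(self):
        self._data = {}  # path: (value, rev)
        self._rev = 0

    def get(self, path):
        return self._data.get(path, (None, 0))

    def compare_and_set(self, path, value, old_rev):
        _, current_rev = self.get(path)
        if current_rev != old_rev:
            return None  # stale revision: another client got there first
        self._rev += 1
        self._data[path] = (value, self._rev)
        return self._rev


MASTER_PATH = "/redis/master"  # hypothetical key name


def propose_master(store, candidate, seen_rev):
    """Try to promote `candidate`, then return the master the store records."""
    store.compare_and_set(MASTER_PATH, candidate, seen_rev)
    master, _ = store.get(MASTER_PATH)
    return master


store = Store()
store.compare_and_set(MASTER_PATH, "redis-1", 0)
_, seen_rev = store.get(MASTER_PATH)  # both clients read this before the failure

# redis-1 dies; the two clients disagree about which slave to promote.
choice_a = propose_master(store, "redis-2", seen_rev)
choice_b = propose_master(store, "redis-3", seen_rev)
print(choice_a, choice_b)  # redis-2 redis-2
</code></pre>
<p>The second proposal is rejected because it is based on a revision that is no longer current, so the losing client simply adopts the winner&#8217;s choice.</p>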
<h2 id='using_doozer'>Using Doozer</h2>
<p>We have client drivers for Go (<a href="https://github.com/ha/doozer/tree/master/src/pkg/client">doozer/client</a>) and Ruby (<a href="https://github.com/ha/fraggle">fraggle</a>), with an Erlang driver in the works. The <a href="https://github.com/ha/doozer/blob/master/doc/proto.md">doozer protocol documentation</a> gives the nitty-gritty of talking to doozer, but most of you will be interested in the interface provided by the driver you&#8217;re using; they each have documentation.</p>
<div class='footnotes'><hr /><ol><li id='fn:1'>
<p>The words &#8220;consistency&#8221; and &#8220;high availability&#8221; have several reasonable definitions, and people too often use them without saying exactly what they mean. When I speak of consistency, I mean absolute prevention of inconsistent writes. The phrase &#8220;eventual consistency&#8221; is a synonym for &#8220;inconsistency&#8221;. When I speak of high availability, I mean primarily the ability to face a network partition and continue providing write service to all clients transitively connected to a majority of servers. Secondarily, I mean the ability to continue providing read-only service to all clients connected to any server.</p>
<a href="#fnref:1" rev='footnote'>&#8617;</a></li></ol></div>Mon, 19 Jul 2010 08:10:33 GMTJoining Herokuhttp://www.advogato.org/person/kr/diary.html?start=17
http://xph.us/2010/07/18/joining-heroku.html<p>I&#8217;m happy to report that I&#8217;ll be joining the team at <a href="http://heroku.com/">Heroku</a>, starting tomorrow.</p>
<p>I&#8217;m thrilled to be able to work on such exciting technology alongside people this smart, talented, and accomplished, inside a small, fast-moving company. This is set to be an amazing learning experience. In many ways, this is my dream job.</p>
<p>This will also be the first time I work for a company where open source is so well-understood and well-integrated in the company culture. Heroku breathes open-source software, both inward and outward. I&#8217;m excited at what this means for my ability to contribute to beanstalkd and other projects in the future.</p>Sun, 27 Jun 2010 10:21:16 GMTAbout Distributed Social Networkinghttp://www.advogato.org/person/kr/diary.html?start=16
http://xph.us/2010/06/27/distributed-social-networking.html<p>So, <a href="http://www.joindiaspora.com/">Diaspora</a>, <a href="http://diso-project.org/">DiSo</a>, <a href="http://appleseedproject.org/">Appleseed</a>, and <a href="http://benwerd.com/2010/06/building-a-distributed-social-network-youre-doing-it-wrong/">a bunch of others</a>. Despite the incredible attractiveness of this solution to the problem of Facebook/Twitter centralization, I find it hard to get excited about any of these projects. They are all doing it wrong, but not exactly for <a href="http://benwerd.com/2010/06/building-a-distributed-social-network-youre-doing-it-wrong/">that reason</a>.</p>
<p>I suspect this is both easier and harder than they think:</p>
<ul>
<li><em>Harder</em> &#8211; <strong>design</strong>. People. Product. Design matters. At all levels. Details matter. At all levels.</li>
<li><em>Easier</em> &#8211; <strong>technology</strong>. Web pages, feed, pub-sub. Done.</li>
<li><em>Harder</em> &#8211; improvements, updates, protocol changes, specs, consensus.</li>
<li><em>Easier</em> &#8211; avoid specs and consensus; dictate that shit. Just leave room for add-ons.</li>
<li><em>Harder</em> &#8211; end-to-end model is broken. <a href="http://tools.ietf.org/html/rfc1627">Network 10 considered harmful</a>. Plus, end nodes (aka laptops) are not always connected.</li>
</ul>
<p>If you have opinions on this, join one of these projects or, better, start your own.</p>
<p>Yeah, there are a bunch already out there, but they mostly suck, so you have a good chance of beating them all if yours can be <a href="http://jasonsantamaria.com/articles/on-good/">excellent</a>.</p>Sun, 2 May 2010 23:08:14 GMTHow to Handle Job Failureshttp://www.advogato.org/person/kr/diary.html?start=15
http://xph.us/2010/05/02/how-to-handle-job-failures.html<p>There&#8217;s a discussion on the <a href="http://groups.google.com/group/beanstalk-talk">beanstalkd mailing list</a> right now about queue introspection and handling failures. My response got a little long, and it could be interesting to users of other queueing systems as well, so here&#8217;s a blog post instead.</p>
<p>When we first started using beanstalkd at Causes, some things in our worker development and deployment process took a while to iron out, but our strategy for handling job failures worked quite well right from the start. In hindsight, I&#8217;m happy about it. This is what we did.</p>
<h2 id='the_basic_rule'>The Basic Rule</h2>
<p>Never clean up jobs by hand. If a failure happens once, it can happen again. Always write code to handle newly-discovered failure types automatically, then run the new code to do the cleanup.</p>
<h2 id='procedure'>Procedure</h2>
<p>Before you begin, note that your workers will be numerous, possibly even more so than your web front-ends. I assume you have good logging infrastructure and analysis tools for your web front ends. Use the same infrastructure for the workers, too. It will make your life easier to see all failures and performance data in one place.</p>
<ol>
<li>
<p>Start by having your workers bury any failed jobs.</p>
</li>
<li>
<p>See what sorts of failures happen in production (by using the high-quality logging that you have to do anyway).</p>
</li>
<li>
<p>You will see some failures where the job can simply be deleted, others where it&#8217;s better to retry the job, and possibly some rare cases where you want to save the job to be inspected by a human (though this sort of hand-holding does not scale and should be avoided). It might also make sense to retry some jobs only a limited number of times before deleting them.</p>
</li>
<li>
<p>Add unit tests and update the code to deal with these known failure types appropriately (i.e. delete or retry the job), but continue to bury unanticipated failures. For retries, don&#8217;t bother with changing the priority, but do add a time delay with exponential backoff. Of course, you must also fix the business logic to recover from these failures or avoid them entirely whenever possible.</p>
</li>
<li>
<p>Redeploy your application.</p>
</li>
<li>
<p>When the new code is in production, kick all buried jobs. They will be handled correctly, and you won&#8217;t lose any jobs.</p>
</li>
<li>
<p>Now look at your worker logs again. This process will have removed a lot of noise from your production logs, and new failure types will float to the surface (though the total volume will of course be much smaller). So repeat.</p>
</li>
</ol>
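<p>As a rough illustration of steps 3 and 4, here is a sketch in Python of the decision a worker might make when a job fails. The exception classes, constants, and the <code>decide</code> function are all hypothetical; the action names mirror beanstalkd&#8217;s <code>delete</code>, <code>release</code>, and <code>bury</code> commands, but nothing here talks to a real queue.</p>
<pre><code class='language-python'>class TransientError(Exception):
    """Hypothetical: a known failure worth retrying, such as a timeout."""


class ObsoleteJobError(Exception):
    """Hypothetical: a known failure where the job can simply be dropped."""


MAX_RETRIES = 5   # assumed retry limit
BASE_DELAY = 2    # seconds before the first retry
MAX_DELAY = 3600  # cap on the delay, in seconds


def backoff_delay(attempts):
    # Exponential backoff: 2, 4, 8, ... seconds, capped at MAX_DELAY.
    return min(BASE_DELAY * 2 ** attempts, MAX_DELAY)


def decide(error, attempts):
    """Return (action, delay_seconds) for a job that failed with `error`.

    `attempts` is the number of times the job has already been retried.
    """
    if isinstance(error, ObsoleteJobError):
        return ("delete", 0)
    if isinstance(error, TransientError):
        if attempts in range(MAX_RETRIES):
            # Retry later at the same priority, with a growing delay.
            return ("release", backoff_delay(attempts))
        return ("delete", 0)  # retried enough times; give up
    # Unanticipated failure: keep the job until new code can handle it.
    return ("bury", 0)


print(decide(TransientError(), 0))   # ('release', 2)
print(decide(TransientError(), 3))   # ('release', 16)
print(decide(TransientError(), 5))   # ('delete', 0)
print(decide(ValueError("bug"), 0))  # ('bury', 0)
</code></pre>
<p>Each new failure type found in the logs becomes another branch ahead of the final <code>bury</code>, which stays as the catch-all.</p>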
<p>After a couple of iterations, true failures will be very rare indeed. Your system will be running smoothly and it won&#8217;t need much attention.</p>Wed, 7 Apr 2010 03:16:23 GMTThe Closed iPad is a Moral Problemhttp://www.advogato.org/person/kr/diary.html?start=14
http://xph.us/2010/04/06/closed-ipad-is-a-moral-problem.html<p>At issue here is <em>control</em>. Apple wants to control what you can and can&#8217;t do with your computer. (To my knowledge no one has claimed this is false. Speculate all you like on Apple&#8217;s motivation for wanting this control; that&#8217;s beside the point.) I happen to find this morally objectionable.</p>
<p>Cory Doctorow and others have astutely noticed that people don&#8217;t respond much to arguments based on morality, so they framed <a href="http://www.boingboing.net/2010/04/02/why-i-wont-buy-an-ipad-and-think-you-shouldnt-either.html">their complaints</a> <a href="http://diveintomark.org/archives/2010/01/29/tinkerers-sunset">differently</a>, <a href="http://al3x.net/2010/01/28/ipad.html">emphasizing</a> <a href="http://www.tbray.org/ongoing/When/201x/2010/01/27/iPad">practical</a> <a href="http://createdigitalmusic.com/2010/01/27/how-a-great-product-can-be-bad-news-apple-ipad-and-the-closed-mac/">effects</a>. That was a smart strategy, because it let them be more persuasive, but make no mistake, this is a moral issue.</p>
<p>Unfortunately, some have failed to see past the surface of these arguments, causing them to write a bunch of increasingly <a href="http://ironictitle.com/post/492064380/choosing-the-ipad">irrelevant</a> <a href="http://powazek.com/posts/2401">rebuttals</a>.</p>
<p>Ultimately, I think both sides of this &#8220;debate&#8221; are falling victim to a massive confirmation bias. If you read a <a href="http://chipotle.tumblr.com/post/491973132/screw-loose">statement like this</a>:</p>
<blockquote>
<p>What makes products great is their innovation, their creativity, other ineffable qualities. Not the applicability of the first-sale doctrine.</p>
</blockquote>
<p>You may just nod in agreement, or you may say, &#8220;hold on there, bucko, that&#8217;s a hefty assertion, but an assertion is not an argument (or even evidence).&#8221; Same goes for <a href="http://www.boingboing.net/2010/04/02/why-i-wont-buy-an-ipad-and-think-you-shouldnt-either.html">something like this</a>:</p>
<blockquote>
<p>Buying an iPad for your kids isn&#8217;t a means of jump-starting the realization that the world is yours to take apart and reassemble; it&#8217;s a way of telling your offspring that even changing the batteries is something you have to leave to the professionals.</p>
</blockquote>
<p>A hundred little implicit (dis)agreements get strung together when you read one of these essays, and determine whether you find it convincing or repulsive.</p>
<p>The confirmation bias is especially strong here because everyone dances around the real issue without saying it outright: the closed nature of the iPad is morally wrong. As with any moral issue, it isn&#8217;t something you can argue for or against effectively without a groundwork of shared values. Either you recognize this issue or not. Either you consider it important or not.</p>
<p>Folks, <em>of course</em> the iPad will sell lots of units, because, in spite of its moral bankruptcy, it appeals to mass-market consumerism, and because it is backed by Apple&#8217;s powerful marketing machine. This may or may not qualify as &#8220;success&#8221;, depending on your point of view.</p>
<h2 id='untouchable_design'>Untouchable Design</h2>
<p>Why are Apple fans so worked up about this device, really? Because of its <a href="http://stevenf.tumblr.com/post/359224392/i-need-to-talk-to-you-about-computers-ive-been">revolutionary design</a>?</p>
<blockquote>
<p>The bet is roughly that the future of computing:</p>
<ul>
<li>has a UI model based on direct manipulation of data objects</li>
<li>completely hides the filesystem from the user</li>
<li>favors ease of use and reduction of complexity over absolute flexibility</li>
<li>favors benefit to the end-user rather than the developer or other vendors</li>
<li>lives atop built-to-specific-purpose native applications and universally available web apps</li>
</ul>
</blockquote>
<p>Thing is, that describes the <a href="http://www.litl.com/">litl</a> spot-on. I think excitement about the iPad is much less about its design, and much more about the simple fact of Apple&#8217;s market position. If these radical design principles were really so important, folks would have been just as excited about the litl&#8217;s launch way way back in November.</p>
<p>This especially undermines all those <a href="http://notes.torrez.org/2010/04/why-i-will-be-buying-an-ipad-this-weekend.html">put-up-or-shut-up</a> <a href="http://blog.fawny.org/2010/04/04/expertisedenial/">arguments</a> about how nobody else competes with Apple&#8217;s design and that&#8217;s why the iPad is great despite its closed nature. I have yet to see a single thoughtful comment claiming that the iPad is good while the litl simultaneously is not. If someone manages to do this, not through speculation, but having actually used the litl (and even if we may disagree on the details or the conclusion), then great. Until then, you can&#8217;t credibly claim that no-one but Apple produces good design.</p>Sun, 7 Feb 2010 07:09:25 GMTDon’t Copy the Call Stackhttp://www.advogato.org/person/kr/diary.html?start=13
http://xph.us/2010/02/06/dont-copy-the-call-stack.html<p>Some runtimes claim to provide first-class continuations, but implement this by copying the entire call stack. This implementation strategy makes continuations totally unusable in production code, and it should be outlawed. Or maybe such runtimes should be required to call them &#8220;shitty continuations&#8221; instead of just &#8220;continuations&#8221;.</p>