This aspect of road design had never occurred to me, but once explained it makes sense. Great article on the design of an oblique crossroads junction and how it's unexpectedly dangerous due to human factors and car design.

“Human error” may be real, but so are techniques to mitigate or eliminate its effects — and driver training is poor when it comes to equipping people with those techniques, let alone habituating them. (And let alone reviewing knowledge of those techniques every few years.)

Much has been written on the pros and cons of microservices, but unfortunately I’m still seeing them as something being pursued in a cargo cult fashion in the growth-stage startup world. At the risk of rewriting Martin Fowler’s Microservice Premium article, I thought it would be good to write up some thoughts so that I can send them to clients when the topic arises, and hopefully help people avoid some of the mistakes I’ve seen. The mistake of choosing a path towards a given architecture or technology on the basis of so-called best practices articles found online is a costly one, and if I can help a single company avoid it then writing this will have been worth it.

Our usual advice to hardware founders is to focus on getting a product to market to test the core assumptions on actual target customers, and then iterate. Instead, Juicero spent $120M over two years to build a complex supply chain and perfectly engineered product that is too expensive for their target demographic.

Imagine a world where Juicero raised only $10M and built a product subject to significant constraints. Maybe the Press wouldn’t be so perfectly engineered but it might have fewer features and cost a fraction of the original $699. Or maybe with a more iterative approach, they would have quickly found that customers vary greatly in their juice consumption patterns, and would have chosen a per-pack pricing model rather than a one-size-fits-all $35/week subscription. Suddenly Juicero is incredibly compelling as a product offering, at least to this consumer.

Great font factoid: 'The name “Noto” comes from the little squares that show when a font is not supported by a computer. These are often referred to as “tofu”, because of their shape, therefore the font is short for No Tofu.'

Specifically, the following 3 classes of errors were implicated in 92% of the major production outages in this study and could have been caught with simple code review:

Error handlers that ignore errors (or just contain a log statement); error handlers with “TODO” or “FIXME” in the comment; and error handlers that catch an abstract exception type (e.g. Exception or Throwable in Java) and then take drastic action such as aborting the system.
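As a rough illustration of how mechanical those three checks are, here is a minimal sketch of a checker for Python source (the study itself looked at Java systems, so this is a translation of the idea, not the study's tooling). The logger names, the list of "drastic" calls and the function names are all my own assumptions.

```python
import ast

BROAD_TYPES = {"Exception", "BaseException"}   # rough Python analogues of Exception/Throwable
LOGGER_NAMES = {"log", "logger", "logging"}    # assumed logger variable names
EXIT_CALLS = {("sys", "exit"), ("os", "_exit"), ("os", "abort")}  # assumed "drastic actions"


def _call_target(stmt):
    """Return (object_name, attribute) for a statement like obj.attr(...), else None."""
    if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
        func = stmt.value.func
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            return (func.value.id, func.attr)
    return None


def _only_logs_or_passes(body):
    """True if every statement in the handler is `pass` or a call on a logger."""
    for stmt in body:
        target = _call_target(stmt)
        logs = target is not None and target[0] in LOGGER_NAMES
        if not (isinstance(stmt, ast.Pass) or logs):
            return False
    return True


def find_suspect_handlers(source):
    """Return a sorted list of (line_number, reason) for suspicious except blocks."""
    lines = source.splitlines()
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        # 1. The handler ignores the error, or only logs it.
        if _only_logs_or_passes(node.body):
            findings.append((node.lineno, "ignores error"))
        # 2. TODO/FIXME left in the handler's source text (comments included).
        text = "\n".join(lines[node.lineno - 1:node.end_lineno])
        if "TODO" in text or "FIXME" in text:
            findings.append((node.lineno, "unfinished handler"))
        # 3. A broad catch followed by a drastic action.
        broad = node.type is None or (
            isinstance(node.type, ast.Name) and node.type.id in BROAD_TYPES)
        if broad and any(_call_target(s) in EXIT_CALLS for s in node.body):
            findings.append((node.lineno, "broad catch then abort"))
    return sorted(findings)


SAMPLE = '''
try:
    risky()
except ValueError:
    pass
try:
    risky()
except KeyError:
    # TODO: handle this properly
    recover()
try:
    risky()
except Exception:
    sys.exit(1)
'''

for line_number, reason in find_suspect_handlers(SAMPLE):
    print(line_number, reason)
```

This only parses the code, it never runs it, so it is safe to point at untrusted source. A real review would still need a human to decide whether a flagged handler is actually wrong.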

(Interestingly, the latter was a particular favourite approach of some misplaced "fail fast"/"crash-only software design" dogma in Amazon. I wasn't a fan)

“A lot of people feel that they want to live in a cul-de-sac, they feel like it’s a safer place to be,” Marshall says. “The reality is yes, you’re safer – if you never leave your cul-de-sac. But if you actually move around town like a normal person, your town as a whole is much more dangerous.”

This is the opposite of what traffic engineers (and home buyers) have thought for decades. And it’s just the beginning of what we’re now starting to understand about the relative advantages of going back to the way we designed communities a century ago.

Marshall and Garrick took the same group of California cities and also examined all their minutely classified street networks for the amount of driving associated with them. On average, they found, people who live in more sparse, tree-like communities drive about 18 percent more than people who live in dense grids. And that’s a conservative calculation.

some amazingly terrible product decisions here. Deleting local copies of unreleased WAV files -- on the assumption that the user will simply listen to them streamed down from Apple Music -- that is astonishingly bad, and it's amazing they didn't consider the "freelance composer" use case at all. (via Tony Finch)

Any design that is hard to test is crap. Pure crap. Why? Because if it's hard to test, you aren't going to test it well enough. And if you don't test it well enough, it's not going to work when you need it to work. And if it doesn't work when you need it to work the design is crap.

Excellent cut-out-and-keep guide to why you should avoid adding a caching layer. I've been following this practice for the past few years, after I realised that #6 (recovering from a failed cache is hard) is a killer -- I've seen a few large-scale outages where a production system had gained enough scale that it required a cache to operate, and once that cache was damaged, bringing the system back online required a painful rewarming protocol. Better to design for the non-cached case if possible.

While someone can certainly make the case that an AK-47, or any other kind of gun or rifle is designed, nothing whose primary purpose is to take away life can be said to be designed well. And that attempting to separate an object from its function in order to appreciate it for purely aesthetic reasons, or to be impressed by its minimal elegance, is a coward’s way of justifying the death they’ve designed into the world, and the money with which they’re lining their pockets.

'9-patch uses png transparency to do an advanced form of 9-slice or scale9. The guides are straight, 1-pixel black lines drawn on the edge of your image that define the scaling and fill of your image. By naming your image file name.9.png, Android will recognize the 9.png format and use the black guides to scale and fill your bitmaps.'

When you put together teams of largely homogenous people of the same class and background, and pay them a lot of money, and when most of those people are under 30, it stands to reason that when someone in the room says, “Let’s do ‘your year in review, and front-load it with visuals,’” most folks in the room will imagine photos of skiing trips, parties, and awards shows— not photos of dead spouses, parents, and children.

So it comes back to this. When we talk about the need for diversity in tech, we’re not doing it because we like quota systems. Diverse backgrounds produce differing points of view. And those differences are needed if we are to put the flowering of internet genius to use actually helping humanity with its many terrifying and seemingly intractable problems.

a few GIFs of procedurally generated architecture by a game developer named Cedric, built using Unity. Cedric describes himself as an "indie game dev focused on social AI, emergent narrative and procedural worlds." Imagine whole game worlds powered by real-time computation at the building level, constantly and parametrically fizzing with architectural forms, barely predictable new Woolworth Buildings and Barbicans sprouting on-demand from the ground whenever needed.

The EU’s new consumer rights law bans certain dark patterns related to e-commerce across Europe. The “sneak into basket” pattern is now illegal. Full stop, end of story. You cannot create a situation where additional items and services are added by default. [...]

Forced continuity, when imposed on the user as a form of bait-and-switch, has been banned. Just the other day a web designer mentioned to me that he had only just discovered he had been charged for four years of annual membership dues in a “theme club”, having bought what he thought was a one-off theme. Since he lives in Europe, he may be able to claim all of this money back. All he needs to do is prove that the website did not inform him that the purchase included a membership with recurring payments.

While there are many defensible aspects of Systemd, other aspects boggle the mind. Not the least of these was that, as of a few months ago, trying to debug the kernel from the boot line would cause the system to crash. This was because of Systemd's voracious logging and the fact that Systemd responds to the "debug" flag on the kernel boot line -- a flag meant for the kernel, not anything else. That, straight up, is a bug.

However, the Systemd developers didn't see it that way and actively fought with those experiencing the problem. Add the fact that one of the Systemd developers was banned by Linus Torvalds for poor attitude and bad design and another was responsible for causing significant issues with Linux audio support, but blamed the problem on everything else but his software, and you have a bad situation on your hands.

There's no shortage of egos in the open source development world. There's no shortage of new ideas and veteran developers and administrators pooh-poohing something new simply because it's new. But there are also 45 years of history behind Unix and extremely good reasons it's still flourishing. Tools designed like Systemd do not fit the Linux mold, to their own detriment. Systemd's design has more in common with Windows than with Unix -- down to the binary logging.

The link re systemd consuming the "debug" kernel boot arg is a canonical example of inflexible coders refusing to fix their own bugs. (via Jason Dixon)

Just because something is "Dutch", that doesn't mean it's good. The Netherlands has many excellent examples, but you have to be very selective about what serves as a model. Cyclists fare best where their interactions with motor vehicles are limited and controlled. They fare best where infrastructure ensures that minor mistakes do not result in injuries.

Anywhere that we rely upon everyone behaving perfectly but where we do not protect the most vulnerable, there will be injuries. Good design takes human nature into account and removes the causes of danger from those who are most vulnerable.

A great reaction to Martin Fowler's "microservices" coinage, from Arnon Rotem-Gal-Oz:

'I guess it is easier to use a new name (Microservices) rather than say that this is what SOA actually meant'; 'these are the very principles of SOA before vendors pushed the [ESB] in the middle.'

Others have also chosen to define microservices slightly differently, as a service written in 10-100 LOC. Arnon's reaction:

“Nanoservice is an antipattern where a service is too fine-grained. A nanoservice is a service whose overhead (communications, maintenance, and so on) outweighs its utility.”

Having dealt with maintaining an over-fine-grained SOA stack in Amazon, I can only agree with this definition; it's easy to make things too fine-grained and create a raft of distributed-computing bugs and deployment/management complexity where there is no need to do so.

Whoa, I had no idea my knowledge of crypto was so out of date! For example:

ECC is going to replace RSA within the next 10 years. New systems probably shouldn’t use RSA at all.

This blogpost is full of similar useful guidelines and rules of thumb. Here's hoping I don't need to work on a low-level cryptosystem any time soon, as the risk of screwing it up is always high, but if I do this is a good reference for how it needs to be done nowadays.

This is a good, high-availability Redis configuration; sharded by userid across 8192 shards, with a Redis master/slave pair of instances for each set of N shards. I like their use of two redundancy systems -- hot slave and backup snapshots:

We run our cluster in a Redis master-slave configuration, and the slaves act as hot backups. Upon a master failure, we failover the slave as the new master and either bring up a new slave or reuse the old master as the new slave. We rely on ZooKeeper to make this as quick as possible.

Each master Redis instance (and slave instance) is configured to write to AOF on Amazon EBS. This ensures that if the Redis instances terminate unexpectedly then the loss of data is limited to 1 second of updates. The slave Redis instances also perform BGsave hourly which is then loaded to a more permanent store (Amazon S3). This copy is also used by Map Reduce jobs for analytics.

As a production system, we need many failure modes to guard ourselves. As mentioned, if the master host is down, we will manually failover to slave. If a single master Redis instance reboots, monit restart restores from AOF, implying a 1 second window of data loss on the shards on that instance. If the slave host goes down, we bring up a replacement. If a single slave Redis instance goes down, we rely on monit to restart using the AOF data. Because we may encounter AOF or BGsave file corruption, we BGSave and copy hourly backups to S3. Note that large file sizes can cause BGsave induced delays but in our cluster this is mitigated by smaller Redis data due to the sharding scheme.
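The routing side of the setup quoted above can be sketched in a few lines. This is a minimal sketch under assumptions: the quoted text gives the 8192 shards and the master/slave pairing, but not how a userid maps to a shard or how large N is, so the modulo mapping, the value 64 and the host names below are all hypothetical.

```python
from dataclasses import dataclass

NUM_SHARDS = 8192          # from the quoted description
SHARDS_PER_INSTANCE = 64   # hypothetical value for the unspecified "N"


@dataclass
class RedisPair:
    master: str
    slave: str

    def fail_over(self) -> None:
        """Promote the slave and reuse the old master as the new slave."""
        self.master, self.slave = self.slave, self.master


def shard_for_user(user_id: int) -> int:
    # Assumption: a simple modulo; the original scheme is not specified.
    return user_id % NUM_SHARDS


def pair_for_user(user_id: int, pairs: list) -> RedisPair:
    """Each pair serves a contiguous run of SHARDS_PER_INSTANCE shards."""
    return pairs[shard_for_user(user_id) // SHARDS_PER_INSTANCE]


# Hypothetical host names, one master/slave pair per group of shards.
pairs = [RedisPair(f"redis-m{i}", f"redis-s{i}")
         for i in range(NUM_SHARDS // SHARDS_PER_INSTANCE)]

pair = pair_for_user(12345, pairs)
print(shard_for_user(12345), pair.master)   # 4153 redis-m64
pair.fail_over()
print(pair.master)                          # redis-s64
```

Keeping the shard count fixed and much larger than the instance count means shards can be moved between instances later without rehashing any user.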

In the Linux Kernel community Rusty Russell came up with an API rating scheme to help us determine if our API is sensible, or not. It's a rating from -10 to 10, where 10 is perfect and -10 is hell. Unfortunately there are too many examples at the wrong end of the scale.

A distributed, fault-tolerant "cron" is something which comes up frequently -- it makes for a great fault-tolerance building block. This one sounds like it's too closely tied into Mesos, though (IMO).

Chronos is our replacement for cron. It is a distributed and fault-tolerant scheduler which runs on top of Mesos. It's a framework and supports custom mesos executors as well as the default command executor. Thus by default, Chronos executes SH (on most systems BASH) scripts. Chronos can be used to interact with systems such as Hadoop (incl. EMR), even if the mesos slaves on which execution happens do not have Hadoop installed. Included wrapper scripts allow transferring files and executing them on a remote machine in the background and using asynchronous callbacks to notify Chronos of job completion or failures.

'Edition has a ‘design for life’ philosophy - we think that unique designer-made items can be a part of our everyday lives without costing the earth. We stock affordable, contemporary and functional products (mostly handmade), including jewellery, home-ware, accessories, art and toys. Every item has been carefully selected, and all are designed here in Ireland.'

I couldn't remember the name for this design principle, so it's worth a bookmark to remind me in future...

'This refers to computer programs that handle failures by simply restarting, without attempting any sophisticated recovery. Correctly written components of crash-only software can microreboot to a known-good state without the help of a user. Since failure-handling and normal startup use the same methods, this can increase the chance that bugs in failure-handling code will be noticed.'
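To make the "failure-handling and normal startup use the same methods" point concrete, here is a minimal sketch, assuming a toy component whose state lives in an append-only file. The `Counter` class, the `run_supervised` function and the simulated fault are all hypothetical names of my own.

```python
import os
import tempfile


class Counter:
    """Toy component: its state is the sum of the numbers in an append-only file."""

    def __init__(self, path):
        self.path = path
        self.value = 0
        # Startup and crash recovery are the same code path: replay the file.
        if os.path.exists(path):
            with open(path) as f:
                self.value = sum(int(line) for line in f if line.strip())

    def add(self, n):
        if n < 0:
            raise ValueError("simulated fault")  # stand-in for any failure
        with open(self.path, "a") as f:
            f.write(f"{n}\n")
            f.flush()
            os.fsync(f.fileno())  # make the update durable before acknowledging it
        self.value += n


def run_supervised(path, inputs):
    """Feed inputs to the component; on any failure, throw it away and restart it."""
    component = Counter(path)
    restarts = 0
    for n in inputs:
        try:
            component.add(n)
        except Exception:
            component = Counter(path)  # "microreboot" to the last durable state
            restarts += 1
    return component.value, restarts


with tempfile.TemporaryDirectory() as workdir:
    print(run_supervised(os.path.join(workdir, "counter.log"), [1, 2, -1, 3]))  # (6, 1)
```

There is no separate shutdown or recovery routine to get out of date, so every restart exercises the only recovery code there is.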

'The companies out there that know how to make decent software have been steadily eating their way into and through markets previously dominated by the hardware guys. Apple with music players, TiVo with video recording, even Microsoft with its decade-old Xbox Live service, which continues to embarrass the far weaker offerings from Sony and Nintendo. (And, yes, iOS is embarrassing all three console makers.)'

See also Mat Honan's article at http://www.wired.com/gadgetlab/2012/12/internet-tv-sucks/ : 'Smart TVs are just too complicated. They have terrible user interfaces that differ wildly from device to device. It’s not always clear what content is even available — for example, after more than two years on the market, you still can’t watch Hulu Plus on your Google TV. [...] They give us too many options for apps most people will never use, and they do so at the expense of making it simple to find the shows and movies we want to watch, no matter where they are, be it online or on the air. As NPD puts it in the conclusion to its report, “OEMs and retailers need to focus less on new innovation in this space and more on simplification of the user experience and messaging if they want to drive additional, and new, behaviors on the TV.” Which is a more polite way of saying, clean up your horrible interface, Samsung.'

'Below is a list of some lessons I’ve learned as a distributed systems engineer that are worth being told to a new engineer. Some are subtle, and some are surprising, but none are controversial. This list is for the new distributed systems engineer to guide their thinking about the field they are taking on. It’s not comprehensive, but it’s a good beginning.' This is a pretty nice list, a little over-stated, but that's the format. I particularly like the following: 'Exploit data-locality'; 'Learn to estimate your capacity'; 'Metrics are the only way to get your job done'; 'Use percentiles, not averages'; 'Extract services'.
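A quick illustration of why 'Use percentiles, not averages' made my shortlist. This is a minimal sketch with made-up latency numbers and a simple nearest-rank percentile, which is only one of several common definitions.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up data: 95 fast requests and 5 very slow ones.
latencies_ms = [10] * 95 + [2000] * 5

mean = sum(latencies_ms) / len(latencies_ms)
print(mean)                          # 109.5
print(percentile(latencies_ms, 50))  # 10
print(percentile(latencies_ms, 99))  # 2000
```

The mean describes no request that actually happened, while the median and 99th percentile show the typical case and the painful tail separately.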

'A/B testing must be done in a modularized fashion. The “fail” case he gave was when Etsy spent months developing and testing infinite scroll to their search listings, only to find that it had a negative impact on engagement.' [...] 'instead of having the goal of “test infinite scroll,” Etsy realized it needed to test each assumption separately, and this going forward is their game plan.'

'thin software layers don’t add much value, especially when you have many such layers piled on each other. Each layer has to be pushed onto your mental stack as you dive into the code. Furthermore, the layers of phyllo dough are permeable, allowing the honey to soak through. But software abstractions are best when they don’t leak. When you pile layer on top of layer in software, the layers are bound to leak.'

John Carmack presciently defines the benefits of an event sourcing architecture in 1998, as a key part of Quake 3's design:

"The key point: Journaling of time along with other inputs turns a
realtime application into a batch process, with all the attendant
benefits for quality control and debugging. These problems, and
many more, just go away. With a full input trace, you can accurately
restart the session and play back to any point (conditional
breakpoint on a frame number), or let a session play back at an
arbitrarily degraded speed, but cover exactly the same code paths."
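The idea can be sketched in miniature. This is a toy under assumptions, not Quake 3's actual code: the `Game` class, its fields and the input format are hypothetical, and the only requirement is that the simulation depends on nothing but the journaled inputs (elapsed time included).

```python
class Game:
    """Toy simulation whose state depends only on the inputs it is fed."""

    def __init__(self):
        self.frame = 0
        self.x = 0

    def step(self, dt_ms, move):
        self.frame += 1
        self.x += move * dt_ms


def run_live(game, inputs, journal):
    """Run the session, journaling elapsed time along with the other inputs."""
    for dt_ms, move in inputs:
        journal.append((dt_ms, move))
        game.step(dt_ms, move)


def replay(journal, stop_frame=None):
    """Rebuild a session from its journal, optionally stopping at a frame number."""
    game = Game()
    for dt_ms, move in journal:
        if stop_frame is not None and game.frame == stop_frame:
            break  # the "conditional breakpoint on a frame number"
        game.step(dt_ms, move)
    return game


journal = []
live = Game()
run_live(live, [(16, 1), (17, -1), (33, 1)], journal)

print(live.x, replay(journal).x)        # 32 32
print(replay(journal, stop_frame=2).x)  # -1
```

Because the clock is just another recorded input, the replay can run as slowly as a debugger needs and still follow the same code paths.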

Japanese designer yuri suzuki has sent designboom images of his 'london underground circuit maps' project developed as part of the designers in residence program at the london design museum, on show until january 13th, 2013. responding to 'thrift' as a theme, suzuki's work explores communication systems in consumer electronics.
a printed circuit board (PCB) is used as a precedent for developing an electrical circuit influenced by harry beck's iconic london underground map diagrams. by strategically positioning certain speaker, resistor and battery components throughout the map, users can visually understand the complex networks associated with electricity and how power is generated within a radio.

'LifeSphere currently offers more than 90,000 unique products and is more than likely run by one person in a suburban bungalow in Phoenix. As far as I can gather their process consists of ALPHABETICALLY(!) applying every single image in the Public Domain photography archive to every object Zazzle offers. Amazingly almost everything they make is amazing. From doggy clothes featuring macrophotography of Chex Mix, to “Thanksgiving Shrimp” skateboard decks, LifeSphere proves 90,000 times over that rigorous process-based design yields infallibly fresh results.' (via Nelson)

'The New Aesthetics, or at least the aspect I’m looking at, is inspired by computer vision. And computer vision is at the point now that computer graphics was at 30 years ago. The New Aesthetics isn’t concerned with retro 8bit graphics of the past, but the 8bit graphics designed for machines of the now.' -- ie, The Robot Readable World, etc. Great essay, and exciting stuff

'Now that Dublin is in our bag, on our Tea Towel and across our Aprons, The Cake Café is going to create a new map of Ireland. We want to fill this map with all of your favorite places in land. Please send us locations that turn you on, fire your imaginations, or just fulfill your dreams; what ever you think should be included. Please pass the request on to friends in far flung parts of the land so they too can send their suggestions; natural or unnatural, animal or man made, a view, a corner of a field, an island or even a journey or hidden places to enjoy a picnic. -- thecakecafe /at/ gmail.com'.

Their map of Dublin is a work of genius -- I love that they include a decent chunk of the Northside, which was a notable failure of the Alljoy Design version. I can't wait to see what they come up with for Ireland.

word of the day, via a comment on http://www.jwz.org/blog/2012/01/snow-crash-simulated/ : 'A skeuomorph /ˈskjuːəmɔrf/ skew-ə-morf, or skeuomorphism (Greek: skeuos—vessel or tool, morphe—shape),[1] is a derivative object that retains ornamental design cues to a structure that was necessary in the original.[2] Skeuomorphs may be deliberately employed to make the new look comfortably old and familiar,[3] such as copper cladding on zinc pennies or computer printed postage with circular town name and cancellation lines'

'organizing your data for efficient processing, especially with respect to cache misses etc.' -- essentially an approach to breaking good OOP design practices in order to gain performance, seems to have come from the game-dev community
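A minimal sketch of the layout difference, assuming a toy particle update. Python will not show the cache benefit the way C or C++ would, so this only illustrates the shape of the data; the class and field names are hypothetical.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """Conventional OOP layout ("array of structs"): one object per particle."""
    x: float
    vx: float


def integrate_objects(particles, dt):
    for p in particles:
        p.x += p.vx * dt


class Particles:
    """Data-oriented layout ("struct of arrays"): one contiguous buffer per field."""

    def __init__(self, xs, vxs):
        self.x = array("d", xs)    # all positions packed together
        self.vx = array("d", vxs)  # all velocities packed together

    def integrate(self, dt):
        x, vx = self.x, self.vx
        for i in range(len(x)):
            x[i] += vx[i] * dt


objects = [Particle(0.0, 2.0), Particle(1.0, -1.0)]
integrate_objects(objects, 0.5)

packed = Particles([0.0, 1.0], [2.0, -1.0])
packed.integrate(0.5)

print([p.x for p in objects], list(packed.x))  # [1.0, 0.5] [1.0, 0.5]
```

Both versions compute the same thing; the second simply keeps each field's values adjacent in memory, which is what lets a compiled language stream through them without touching unrelated fields.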

'nobody — almost literally 0% of users — uses the menu bar, and only 10% of users use the command bar. Nearly everybody is using the context menu or hotkeys. So the solution, obviously, is to make both the menu bar and the command bar bigger and more prominent. Right?
Microsoft UI has officially entered the realm of self-parody.' (via Nelson)

"As with any company, Apple consists of many divisions (Sales, Marketing, Customer Service, etc.) THE most powerful division at Apple is Industrial Design. For those of you unfamiliar with the term industrial design, this is the division that makes the decisions about the overall look and feel of Apple's products. And when I say "the most powerful", I mean that their decisions trump the decisions of any other division at Apple, including Engineering and Customer Service. Now it just so happens that the Industrial Design department HATES how a strain relief looks on a power adapter. They would much prefer to have a nice clean transition between the cable and the plug. Aesthetically, this does look nicer, but from an engineering point of view, it's pretty much committing reliability suicide. Because there is no strain relief, the cables fail at a very high rate because they get bent at very harsh angles. I'm sure that the Engineering division gave every reason in the world why a strain relief should be on an adapter cable, and Customer Service said how bad the customer experience would be if tons of adapters failed, but if industrial design doesn't like a strain relief, guess what, it gets removed."

Phil Gyford reworks the Grauniad's website using their open content API. I really like the navigation and just-the-text nature, but I still feel a need to know what other articles are "nearby", which this doesn't quite provide. Still, excellent work

'Of course you have to appreciate the irony – the agency in charge of enforcing France’s new anti-piracy legislation using a pirated proprietary font in its very own logo.' hoho! hoist by their own petard