Kevlin Henney on Worse is Better and Programming with GUTs

At the recent Agile Singapore conference Kevlin Henney gave two talks focusing on the importance of simplicity in architecture and implementation and on programming with Good Unit Tests (GUTs). He spoke to InfoQ about the thinking behind his talks and how they can be implemented.

InfoQ: Kevlin, thank you for taking the time to talk to InfoQ today. We are at the Agile Singapore Conference and you've given two talks at the conference: "Worse is Better, for Better or for Worse" and "Programming with GUTs". Let’s start with the first one: why is worse better?

Kevlin: That title is in some ways unfortunate. It's not my phrase. It comes from Richard Gabriel. He coined the phrase nearly 25 years ago and he explored the idea very much through the '90s. When people hear the phrase "Worse is Better," they immediately think it's a compromise on what we could call intrinsic quality. When people think of good and bad software and they apply worse or better to it, they're thinking in terms of bugginess and things like that. That's not what it meant. People hear the term and they immediately say, "Yeah. That's what we do," and really it isn't better.

InfoQ: Worse is worse.

Kevlin: Or they mistake it for the “good enough software” approach, which is Ed Yourdon's approach. Richard Gabriel was saying something slightly different. The thing that he considered that some people might consider worse is the compromise on scope. He was looking and contrasting what he considered to be two different schools of development.

There was the idea of approaches that favored a simple large scale design, simple interfaces — the UI level and the intercomponent level — the idea of consistency being very much a dominant characteristic – this all sounds very reasonable – correctness also being a dominant characteristic, again, very reasonable. But the key thing that he said differentiated this kind of approach was completeness as a maximalist kind of thing. People would think a piece of software should cover all the reasonable cases and anticipate those that were within shooting distance of the core functionality.

But looking around, he was seeing people being successful with another approach, and this is his "worse is better", where indeed there is simplicity, but it's mostly simplicity of implementation rather than of interface. And indeed there is correctness and there is consistency, but the dominant characteristic is reduced completeness. Focus on a small scope.

See if you can start with a small scope; if that works out, that's great. If it doesn't work out, then because you didn't have much scope to start with, it's quite easy to change, because you had a simple implementation. So there is this idea of growing software, experimenting with software, an incremental approach, and it's worth going back to what I said: this was 25 years ago. This predates not only a lot of Agile thinking but also RAD thinking in the early '90s.

InfoQ: We'd call it lean startup today.

Kevlin: Indeed. Especially the idea of experiment, if that doesn't work out that's fine. Try again. You've got something there.

A couple of years ago, I started getting interested (I can't remember what the trigger was) in going back to this. I’d been involved in a panel discussion at OOPSLA in 2000 that was on this. And I don't think I've really looked at the idea for over a decade and I sort of went back to it, re-read some of his stuff and was surprised at how much I'd forgotten but also surprised at how much was incredibly fresh.

And so some element of it is: here's stuff we kind of already knew, or at least some people were aware of. It predates a lot of the thinking that we find in Agile, but it also tells us a few things we might be missing. It tells us a little bit more in terms of some of the qualities of the software as a product that sit outside what we might call mainstream Agile thinking. It's not that people aren't doing it. It's just that perhaps it's pushed to the fringes, or people are doing it but not saying this is one of the most important characteristics. So that's why I returned to it.

InfoQ: So building the product for extensibility.

Kevlin: Well, not really building the product for extensibility. Building it so that it is easy to throw away, for a very qualified use of the term throw away. I'm fond of pointing out that we talk about throw-away software but, in software development, we never throw it away. It just doesn't get integrated into the main product, but actually thrown away? No. The source code really just endures on somebody's hard drive or somewhere in the cloud.

There is this idea that you make it small. He emphasizes this stuff should be small, should be small and simple in implementation because, if it's small and simple, if you get it right then it's easy to extend. In that sense, yes to extensibility. But not by putting hooks and all kinds of anchors for possibilities that we're just guessing at, but making it so simple that extension is not a problem. Making it so simple that rewriting is not a problem and making it so simple that throwing it away and starting again is not a problem.

InfoQ: So that is a design philosophy.

Kevlin: It is. And indeed that's a very good way of describing it. It is a very broad philosophy about not just the code itself but how we should treat the code and the product as a whole and that these options are open. It's okay to say "we didn't get that right" and "actually this is not the place I would like to start from". It's okay to say "this will take a rewrite, because now we know how it works". He also emphasizes correctness very, very highly. So therefore definitely not what most people think of as worse is better.

InfoQ: So that intrinsic build quality - what you have built is really good.

Kevlin: Yes. What you have built is good.

InfoQ: But it's small and safe to discard, simple to extend.

Kevlin: Exactly. And in terms of execution quality, going back to the question of usability, he says to implement a simple interface; anything adequate will do. The real thing that you need to focus on is the implementation, because the chances are that if you've got something that's good, you or someone will eventually want to change it. And it's very difficult to change things that are not good and are complex. If you've got something that's not quite right, you'll still need to change it in some way or you're going to discard it. If you want to discard it, you want to make sure you retain the knowledge that was in there, which is difficult with complexity. But in terms of execution qualities he emphasizes speed, one of the things that makes a product usable.

People often overlook this. They often focus purely on the visual aspect, but one of the dominant usability characteristics is performance, plain and simple. That's it. You've got a budget of a hundred milliseconds to turn a user action around, and if you can do something within that, they think it's instantaneous. Anything longer and it is perceived as less useful. You have to start compensating for that.

InfoQ: These are things that Philippe Kruchten calls the Architecturally Significant Nonfunctionals.

Kevlin: Yes. That's right, although I have a mild allergy to the word nonfunctionals!

InfoQ: Me too. To me they're qualities.

Kevlin: They are qualities, yes. And we can be much more specific. Nonfunctional is a wonderfully vague English word. We're trying to define something by its negative. What qualities are important? Well, not the functional ones. It doesn't really make sense. We also use the term nonfunctional to describe something that is broken! When we talk about nonfunctional software that's generally not a good thing.

So a nonfunctional behavior is generally and literally a thing you don't want. It's a thing we work against rather than for.

What we're interested in are qualities of execution or qualities of development. As "nonfunctionals", these have merged together. Labelling these different categories as "nonfunctional requirements" mixes things together that happen at different times: at run time, such as speed and memory usage, versus development time, which is a completely different organism. Gabriel in his writing goes into this area, which he broadly calls habitability, in terms of the development quality of the code, which encompasses and touches on things like maintainability, and so on.

InfoQ: And that's where this simplicity would have to be at its core.

Kevlin: Yes. The idea is you are creating something that is either easy to change or easy to discard and in both cases simplicity is paramount because the bottleneck in software development is not typing. It is comprehension. It is understanding and if I look at something and I say, I understand how to build another version of this but better, then I'm a long way ahead of somebody who has a large, vested, mass of stuff that they don't understand. Or I can simply say I understand this and I can change it. I understand it. I can discard it. This is an immense power that we potentially have.

InfoQ: So again thinking of it at an architectural level where we're building products that are based on these simple components, your interfaces have got to be simple.

Kevlin: Yes indeed.

InfoQ: That you literally truly can. Did that experiment. Took it. It worked. That one failed. Throw it away. This one worked. Change it. It comes back again, what you were saying these are ideas from 25 years ago. This stuff we've known about and as an industry we know how to do. Why don't we do it?

Kevlin: That's an interesting one. I think there are a number of reasons. I'm going to dismiss one of the most commonly cited reasons, the youth of software development. A lot of people cite that. They say it's very young and so on but it's not that young. It's really not that young and I'm not even talking of Charles Babbage and stuff like that. I'm really talking in terms of stuff going back to the middle of the last century. There are professions that are younger and more mature.

So I don't think it's a matter of age. Somebody said growing old is mandatory but growing up is optional, and I think it's the second bit. I think it's the maturity; often maturity comes about because of a need. I don't think that software development has really had that need thrust upon it, because money is pouring into it. When the money dries up, things tend to change. We see maturing change in engineering professions and other professions because of fears over safety, because of fears over viability. When a concern is expressed economically, people change. They have a cause to change and to change their practice.

As it is software development is growing. We're still in an inflationary phase and there are bits that are suffering and there are bits that are doing very well, but there are no brakes. There's no real slow down. There's no strong, core demand for quality as a necessary driver of the state of our practice. We see this in islands, archipelagos even, that are growing but we don't see a driven need that covers all software development.

I think there is that. There is also the fact that there is a lot of innovation, and so the leading edge of software development is quite distant from the trailing edge. In many cases people are doing things in profoundly different environments and they simply don't hear of what other people are doing. All the knowledge is out there. I'll borrow a quote from William Gibson, the author. He observed that "the future is already here, it's just unevenly distributed".

I think that we have that with software development. We know collectively how to develop good software, software that people want to use and software that is of robust and appropriate quality and software that can be changed and changes but that knowledge is not distributed evenly across all software development.

InfoQ: Yes. One of the things that I'm personally passionate about is that I look after the Agile Manifesto translation program. One of the core principles is technical excellence. I do not see technical excellence in many teams.

Kevlin: Exactly. People often look at the four values but they don't tend to look at the principles. I've got into the habit of pointing this out, particularly when talking to folks about architecture when they regard architecture and agility as either two different camps or as actively antagonistic. I'm quite happy to derive one from the other. I highlight the principle that technical excellence is key. Also the idea of maximizing the amount of work not done, right back to the simplicity. And so it's that idea that I think perhaps we are very good at skim reading the headlines but we're not so good at going a little bit further than that.

InfoQ: It's a page and a half and there are 12.

Kevlin: "There's 12 of them, yeah. But it's four values. I read those but the other 12, well, that's a bit much. That's 300% extra!"

InfoQ: One of the speakers yesterday said that we can keep four things in our mind at once.

Kevlin: There we go. There you have them.

InfoQ: Thank you very much for that. So Programming with GUTs. What's GUTs?

Kevlin: GUTs is a phrase -- Good Unit Tests. It's a phrase that Alistair Cockburn coined in 2008, so perhaps not last century but last decade. He coined it when trying to describe a concern and a discussion that people were having around test-driven development. And I've experienced this for myself when I do training, when I do consultancy and when I respond to emails. There is this perception that some people are using the term TDD simply to describe better testing practices than they had before.

They have gone perhaps from a zero-testing culture – or a zero-unit-testing culture – to "we're doing some unit tests and for us that is a radical change". They are experiencing some kind of personal revolution in that sense and they want a name for this. Just "testing" is not enough. It's not quite descriptive enough, and the driven aspect sounds appealing, so they say, "Well, yeah. We are using the tools that people associate with TDD. We are doing things in an Agile way, broadly." And they may, without realizing it, misappropriate the term when they're not strictly speaking doing Test-Driven Development.

Test-Driven Development is a thing that has a particular kind of mechanics. It's a very specific description. So it's not a good thing or a bad thing that you are doing or not doing it but they're trying to describe what they're doing and they're lacking a term. And there are a number of ways of describing traditional style unit testing but it was more of the case that Alistair identified more the objective and the journey than the mechanics.

Good Unit Tests -- that is, as it were, a destination, rather than Test-Driven Development, which is one way of getting you there. What Jacob Proffitt has called Plain Old Unit Testing, POUT or POUTing, which is a slightly more fun term than just "test after", is another way you might arrive at good unit tests. You can also work in a defect-driven style: respond to a defect by writing tests, and so we refer to that as DDT, some pesticide for bugs.
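The defect-driven (DDT) style mentioned above can be sketched in a few lines of Python. The `word_count` function and the test names are illustrative assumptions, not examples from the interview:

```python
# A sketch of the defect-driven (DDT) style: a bug report arrives,
# we first pin it down with a failing test, then fix the code so the
# test passes and guards against regression. All names are hypothetical.

def word_count(text):
    # The original buggy version split on a single space character,
    # so "a  b" (double space) was counted as three words.
    # Fixed: str.split() with no argument splits on any run of whitespace.
    return len(text.split())


# The regression test written in direct response to the defect report:
def test_word_count_ignores_repeated_spaces():
    assert word_count("a  b") == 2


# An ordinary case, kept alongside the regression test:
def test_word_count_on_simple_sentence():
    assert word_count("the quick brown fox") == 4
```

The point of the style is that the test is written first, fails against the buggy code, and stays in the suite afterwards as documentation of the defect.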

There are many different ways of achieving good unit tests. These are different roads by which you will arrive at that destination. So what GUTs is about is more the question of what a good unit test looks like. A lot of people have focused on the mechanics of TDD, perhaps less on what makes a good unit test. I think that discussion historically was missing, and that's something I have spotted when people are saying, "Well, we've got lots of tests." Well, yes. But are they good? How do we define good?

InfoQ: So what makes a Good Unit Test?

Kevlin: I guess one of the obvious ones is that a Good Unit Test is something that is readable. That seems a trivial thing to start with, but I think it's a very important thing to start with, because there is still a perception that tests are second-class citizens, that they are there for the machine to execute rather than for the human to read. But the "real stuff", if you like, is the production code. And that is a problem when we come back to it, because we are left with legacy. This is how you create legacy code. This is how you create legacy test code. We are left with this residue of stuff. And yeah, sure, it runs and things go green, but we don't know why they go green. It's almost an act of software archaeology to interpret the tests themselves.

So they need to start off as first class citizens. I guess there's the seemingly trivial point that they should be automated. I assume that, but I think it's worth emphasizing. Sometimes when people are coming from very different development cultures, it is worth emphasizing that actually, yeah, we do mean code testing code, not humans testing software. So the written tests are in a programming language and they are readable to a developer as comfortably and easily as what we consider clean code in production code. So applying the same standards at that level.

The next thing is: what does a test test? How do I know that it makes sense? It's almost a trivial point, but it's worth going back to. I stopped using the terminology "test case" a few years ago. Although it's an old term, it felt too JUnit-specific and not necessarily corresponding to the thing I was pointing at: the artifact, the test method, so to speak.

And I stopped using that term many years ago but a couple of years ago I started using it again because the phrasing itself makes a very nice point. A test case should be a case and what we find is that many test cases are not cases. They are many cases. They are not individual. There is a test method and it tests multiple aspects of an individual object or an individual function. It's this idea of aligning a test method with a case of behavior and we can kind of see that this idea is central if you start looking at things like behavior-driven development.
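The distinction Kevlin draws here, between a test method that bundles many cases and test methods that each capture a single case of behavior, might be sketched like this in Python. The `Stack` class and all test names are hypothetical examples, not code from the interview:

```python
# A minimal class under test, purely for illustration.
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


# Not a case: many cases crammed into one test method.
def test_stack():
    s = Stack()
    assert s.is_empty()
    s.push(1)
    assert not s.is_empty()
    assert s.pop() == 1
    assert s.is_empty()


# One case of behavior per test, each named for that behavior:
def test_new_stack_is_empty():
    assert Stack().is_empty()


def test_push_makes_stack_non_empty():
    s = Stack()
    s.push(1)
    assert not s.is_empty()


def test_pop_returns_most_recently_pushed_item():
    s = Stack()
    s.push("first")
    s.push("second")
    assert s.pop() == "second"


def test_pop_on_empty_stack_raises():
    try:
        Stack().pop()
        assert False, "expected IndexError"
    except IndexError:
        pass
```

When a single-case test fails, its name alone tells you which behavior broke; the bundled version only tells you that "the stack" is wrong somewhere.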

If you look at Steve Freeman and Nat Pryce's TDD work, we see this idea characterized strongly, but it's not often conveyed to other people as simply a good way of testing in general. This is not something that is isolated to the overall style that Steve and Nat have. It's not isolated to the idea of using BDD. This is just a general property of unit tests. It makes them readable.

And therefore that leads us to naming and partitioning, and we get the idea of test cohesion and all of these other aspects. The principles people debate and apply in core code, one way and another, also apply to tests, although they look slightly different there. There is the idea of cohesion. There is the idea of tests having narrative. There is the idea of tests falling into naming hierarchies and being very, very clear about what they are. This is a stylistic point that transcends the mechanics of any individual framework. For me, it is the idea that the test carries the weight of specification.

InfoQ: Are we moving that concept of the unit test up into the level of the system, the acceptance test?

Kevlin: It can become that. I think in this case we should always remember that software is at some level recursive: we build large pieces of software out of smaller pieces of software, and so on. Most of the history of software development has been about the characterization of the larger and smaller pieces that we can make, whether we go down to the ability to pass around individual blocks of code, which a number of languages have – if they didn't already have it, they now have the ability to pass lambdas or some form of closure concept around; that is a bit of a big shift.

So we have the idea of I could pass these things around but I can also pass a whole system and components around and refer to them. It seems to be that idea and so therefore there is this idea of I can test the smallest things and I can build larger things out of small things and therefore I could test those and so on and so on. Then we hit a boundary where there is a human and we give that a different name. We may say that is an acceptance test or system level test of some type.

Although the tools you may use at these different depths may differ, the sensibility behind them is the same. The idea is that when I'm testing an object, I am testing requirements against that object — not requirements given by a user and not requirements that are owned by a product owner, but requirements that are stimulated by another piece of code. If this piece of code, this object, does not meet those requirements, then the other piece of code cannot rely on it. It cannot use it. There is a contract here, and we are using case law rather than other contractual methods to express what we mean by this. Offer somebody a guarantee of behavior.

InfoQ: Where do we get the relationship to behavior-driven development and domain-specific languages, allowing a nontechnical person to express those needs and get them translated through?

Kevlin: Well, I think you can get them from everywhere. I wouldn't like to say there is a single source. A number of examples I use in training are comprehensible to somebody who is not a developer, and in those cases you can end up doing what we sometimes call peer programming. In peer programming you have a domain expert on one side and the developer on the other, and there is an easy conversation between the two.

The identifiers in the code carry the weight of description and that's the point: it is easy for non-programmers to skim through just as I, as a non-Chinese speaker, can skim through a newspaper written in Chinese and identify a number of relevant items. I can identify phone numbers. I can identify URLs. I can identify names written in Roman alphabet and so on. I can skim through and identify those readily and discuss at that level.

Similarly, somebody who is peer programming can go through code and identify a number of relevant terms. The identifiers carry the weight of the domain if they are expressed at the right level. You can have that conversation without even getting to the question of tools that are much more focused on using English. Well, sure, that's another level but I want to say that we can actually take this right down to the level where people are messing about with C, not even OO languages, not even something that is in fully managed environment.

From the testing point of view, it's going to be much easier than looking at the mechanics of a piece of production code because there's a simple narrative — there's a given-when-then; there's an arrange-act-assert; it's the same three-act play and it's relatively syntax light once you know to see through it, once you've learned what to pick out. So even at that level, we can get other people reading the tests.
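The three-act shape Kevlin describes, given-when-then or arrange-act-assert, can be sketched as a single readable test. The `Account` class and all identifiers below are hypothetical, chosen only to show the structure:

```python
# A toy class under test, purely for illustration.
class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


def test_deposit_increases_balance():
    # Given (arrange): an account with an opening balance
    account = Account(balance=100)

    # When (act): a deposit is made
    account.deposit(25)

    # Then (assert): the balance reflects the deposit
    assert account.balance == 125
```

Because every test follows the same three-act play, even a non-programmer who has learned to see through the syntax can pick out the domain terms (account, deposit, balance) and follow the narrative.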

InfoQ: And again, these are ideas that we have for a long, long time.

Kevlin: Indeed.

InfoQ: I learned COBOL programming in 1982 and I was told name your variables carefully.

Kevlin: Yes. None of these things is in that sense new, but it's more a case of how we express them or emphasize them, perhaps the way that we lend weight to certain ideas. But even just basic techniques: you can compare and contrast a test case that was merely written to test code with one that was written to explain, specify or define it. Both of them are testing for correctness. The challenge is not "does it work?" but what do we mean by "it works"? That's the hard question.

So that I can read the test and say this is what we mean by "it works", and when we run it, we understand what we mean by "it works" rather than just "something succeeded but we don't know what". And it's that idea. Perhaps it's a little more nuanced than it was a few decades ago, when we used to just say "use good names". I was told that, and I did Fortran as my first main programming language. Yeah, I remember that advice and, guess what, everybody in the office took it differently. So perhaps we've become better at identifying concrete examples that we can learn from. I think that may be the difference between now and then.

About the Interviewee

Kevlin Henney is an independent consultant, trainer and writer based in the UK. His development interests are in patterns, programming, practice and process. He has been a columnist for many magazines and web sites and is co-author of A Pattern Language for Distributed Computing and On Patterns and Pattern Languages, two volumes in the Pattern-Oriented Software Architecture series.