Building the right thing, building it right, fast

Who I am

My name is Jakub Holy and I’m a software craftsmanship enthusiast and a (mainly JVM-based) developer since ~ 2005, consultant, and occasionally a project manager, working currently with TeliaSonera AS in Norway. More under About and #opinion.

Archive for May, 2012

This was a rich month, bringing some hope for ORM, providing a peep-hole into the bright and awesome future with in-browser video and other cool web-stuff presented at WebRebels 2012 and IDEs providing immediate feedback and visualisation. There were valuable articles about simplicity and quality in software and good talks about the lean startup (i.e. enabling innovation) and other topics.

Recommended Readings

M.Fowler: ORM Hate – Why ORM is actually a good solution – a very valuable article where Fowler opposes the popular trend of criticising Object-Relational Mappers such as Hibernate. Yes, using an ORM is difficult and a leaky abstraction – but that’s because the problem of mapping from a rich in-memory object model to a relational store is inherently difficult (and you need to do it with or w/o an ORM tool) and because those last 10-20% of DB access require human intelligence. If you can avoid the need for ORM by using the relational model also in memory or by using a NoSQL database with a data model that fits your in-memory model then it’s great to do so but often you can’t and then using ORM is the best solution. You certainly don’t want to code your own “lightweight” ORM.

Alan Quayle on WebRTC, the HTML5 standard for in-browser video/text communication – intro & status – this is an exciting technology coming to our browsers. Some quotes: “WebRTC enables applications such as voice calls, video chat, file sharing, messaging, white-boarding, gaming, human computer interaction, etc. without any client or plug-in download to run from a browser using simple HTML and JavaScript APIs. Real time communications becomes pervasive on the internet. … Essentially any browser becomes a SIP end point, a telephone, an ‘open’ Skype client, an end point for any real-time communication and control. … Likely by the end of this year we’ll see Chrome and Firefox running WebRTC.“

Communicating Sequential Processes: Theory for reasoning about concurrent, interacting processes – an inspirational reading about a much better way to do concurrency than Java threads; “… [CSP] is a language for describing patterns of interaction between concurrent objects. It is supported by an elegant, mathematical theory, a set of proof tools, and an extensive literature.” The beauty is that thanks to the theory behind, you can actually reason about the interactions and verify their correctness, contrary to the feared mess of Java threads. CSP is broadly similar to the popular Actors model and is implemented in Occam while it also influenced Erlang’s concurrency model and Go. The library JSCP brings it to Java. I guess we’re better of using Actors due to their popularity and maturity though the mathematical backing of CSP with the potential of formal proofs of correctness is indeed attractive. Any of the two is better than using threads directly because:

The monitor-threads model provided by Java, whilst easy to understand, proves very difficult to apply safely in any system above a modest level of complexity. One problem is that monitor methods are tightly interdependent, so that their semantics compose in non-trivial ways […]

Rich Hickey introduces the Reducers library: simplicity in practice – a beautiful example of simplifying something by taking appart all the unrelated but mingled concerns and focus only on those really needed. Whether you’re interested in Clojure or not, you should read the beginning of the post where Hickey explains how the current collection functions based on first (returns 1st element) and rest (returns the remaining ones) mix too many things (ordering, output representation, etc.) and how this “new super-generalized and minimal abstraction for collections” avoids that and thus provides e.g. for doing things in parallel and composing transformation without producing intermediate collections. Beautiful! (PS: I’ve blogged about more examples of pursuing simplicity & gaining power.)

M. Fowler: Cannot Measure Productivity – a thoughful discussion of why the productivity of programmers is hard/impossible to measure (i.e. you should concentrate on measuring other, more useful metrics) “[..] false measures only make things worse.”

Gojko Adzic: Redefining software quality – an obligatory read that introduces a holistic view of SW quality and the quality pyramid. The key idea is that there are multiple, vertically organized facets of quality and once a more basic facet is saturated, you should move and and concentrate on the next facet and level of quality. The quality pyramid: Deployable & functionally OK > Performant & secure > Usable > Useful > Successful. Once a particular level is satisfied, it is wasteful to put more effort into it and you’ll bring much more value to the customer by focusing on the next higher level. Gojko: “Yet from what I see most software teams invest, build and test only at the lowest two levels, gold-plating things without a way to explain why that is bad.”

Is Pair Programming for Me? – the author, who claims to have taught pair programming to 200+500 people, points out that pair programming is a skill that must be (consciously) learned, or actually a number of inter-personal skills. He also describes the cycle people go through when learning it, including a temporary downswing in productivity and negative view of pairing (therefore people should do it at least for 3-4 weeks to overcome the problems and gain the benefits).

Videos

Bret Victor: Inventing on Principle (55 min, see at least the first 5 min) – very inspiring! Victor firmly believes that “creators need an immediate connection to what they create” and demonstrates how this can be achieve when coding image rendering, a game, an algorithm, when designing a circuit. After watching it for few minutes you will think: How could we have been working with such crappy tools without realizing how limited they are? Fortunately people started to apply the idea of an immediate connection between code and the result, f.ex. in LightTable and Bikeshed‘s IDE. On the other hand, there is an evidence that this may be too hard with the current programming languages.

Eric Ries: Evangelizing for the Lean Startup – entertaining and enriching introduction into an approach for bringing innovation to life – f.ex. in startups – withou failing unnecessary, demonstrated on the example of the author’s failed and successful startup. Many innovators fail because they don’t realize that their key challenge is that they don’t now neither the problem (who are our customers and what they need) nor the solution (the product to satisfy the need) and thus what they need to do is to experiment and learn in the shortest cycles possible. If you wander what the buzz about lean startup is or how to build innovations, this is the ultimite source you should watch. The video has 1h but the first 20-30 min will give you a sufficient overview. The key points summarized by the Iterate lean guru Anders Haugeto:
dd

– There is only one way to measure progress: Progress == The amount of things you have learned from your real customers
– Hence, you need to work a continuous loop to build, measure and learn as fast as possible. Typical iterations, like sprints, are too long, hence inefficient
– Until you have an established product, even recognized engineering practices like TDD, sprints, refactoring, and all the XP-stuff are less important than this feedback cycle
– Even the perfect agile method is nothing, if you’re using it to build the wrong product: How can you know you are heading in the right direction?

WebRebels 2012 conference talks – I’d especially recommend the awakening talk by Zed Shaw pointing out that we’re building amazing things – but on top of crapy technologies without realizing anymore that the technologies are crappy and could/should be much better. Erlend Oftedal’s talk about webapp security was an (scary) eye-opener for me. If you’re considering offline webapps with HTML5’s webapp cache and/or local storage then you must listen to Jake Archibald’s painful story of various pitfalls hidden there. Christian Johansen’s Pure, functional JavaScript is a pleasure to listen to. Check out the program.

Useful Tools

Guard – cross-platform tool that can watch for file changes and execute actions (“guards”) when a file is changed, useful e.g. to execute tests/style checks only on the files being modified. Includes support for many testing/checking tools and multiple notification means such as Growl.

ThreadLogic – Thread dump analyzer that understands common patterns found in application servers and enabling the definition of custom patterns. Supports Sun, IBM, and JRockit.

Dumbster – mock SMTP server for unit testing (start in @Before, get sent messages in the test, stop afterwards)

Quotes

Kai Thomas Gilb, in a talk proposal for JavaZone 2012:

Accurate estimation is impossible for complex technical projects, but keeping to agreed budgets, and deadlines is achievable by using feedback and change.

Clojure Corner

StackOverflow: Clojure Performance Benchmarks – links to various discussions and benchmarks (beware that older results and facts are likely to be outdated). And of course you must keep in mind that 1) benchmark only measure what they measure, e.g. the outcomes cannot be generalized and that 2) benchmark results aren’t relevant for your problem unless you’re doing exactly the kind of operations being benchmarked (e.g. who cares that X is 100 ms slower if your code spends 1s waiting for a XML file download?) (Craig Andera had a pretty good experience report from webapp performance testing including what (not) to do)

Goldberg (at GitHub) – Johann Sebastian Bach’s Goldberg Variations in Overtone by @ctford; using Overtone and Clojure to build up mathematical and functional definitions of canons. Deon Garrett: “Go right now and download the code from Chris’ talk. If you don’t know Clojure, use this as an excuse to learn it – it’s that good.”

Reducing incidental complexity is a primary focus of Clojure, and you could dig into how it does that in every area.

…

In particular, the use of objects to represent simple informational data is almost criminal in its generation of per-piece-of-information micro-languages, i.e. the class methods, versus far more powerful, declarative, and generic methods like relational algebra. Inventing a class with its own interface to hold a piece of information is like inventing a new language to write every short story. This is anti-reuse, and, I think, results in an explosion of code in typical OO applications.

Have you ever worked with an application where you had to copy data from one object to another and another and so on before you actually could do something with it? Have you ever written code to convert data from XML to a DTO to a Business Object to a JDBC Statement? Again and again for each of the different data types being processed? Then you have encountered an all too common antipattern of many “enterprise” (read “overdesigned”) applications, which we could call The Endless Mapping Death March. Let’s look at an application suffering from this antipattern and how to rewrite it in a much nicer, leaner and easier to maintain form.

The application, The World of Thrilling Fashion (or WTF for short) collects and stores information about newly designed dresses and makes it available via a REST API. Every poor dress has to go through the following conversions before reaching a devoted fashion fan:

Parsing from XML into a XML-specific XDress object

Processing and conversion to an application-specific Dress object

Conversion to a MongoDB’s DBObject so that it can be stored in the DB (as JSON)

Conversion from the DBObject back to the Dress object

Conversion from Dress to a JSON string

Uff, that’s lot of work! Each of the conversions is coded manually and if we want to extend WTF to provide information also about trendy shoes, we will need to code all of them again. (Plus couple of methods in our MongoDAO, such as getAllShoes and storeShoes.) But we can do much better than that!

In Simple Made Easy argues Rich Hickey that mixing orthogonal concerns introduces unnecessary complexity and that we should keep them separate. This mixing sometimes occurs on such a basic level that we believe that there is no other way to do it, an example being the interleaving of polymorphism and hierarchical namespacing represented by OO class hierarchies. Taking those “complected” concerns apart and dealing with them separately yields cleaner, simpler solutions and sometimes also more powerful ones because you are free to combine them as you need and not as the author decided.

For a recent project I needed to be able to start on-demand clusters of machines in Amazon EC2. We needed each instance in a cluster to allow SSH and sudo access for all team members and to install and configure the software appropriate for that cluster (“database” node or “testclient” node).