Building the right thing, building it right, fast

Who I am

My name is Jakub Holy and I’m a software craftsmanship enthusiast and a (mainly JVM-based) developer since ~ 2005, consultant, and occasionally a project manager, working currently with TeliaSonera AS in Norway. More under About and #opinion.

Recommended Readings

Nathan Marz: Principles of Software Engineering, Part 1 – Nathan has worked with Big Data at Twitter and other places and really knows the perils or large, distributed, real-time systems and this post contains plenty of valuable advice for making robust, reliable SW. Main message: “there’s a lot of uncertainty in software engineering“; every SW operates correctly only for a certain range of inputs (including volume, HW it runs on, …) and you never control all of them so there always is an opportunity for failure; you can’t predict what inputs you will encounter in the wild. “[..] while software is deterministic, you can’t treat it as deterministic in any sort of practical sense if you want to build robust software.” “Making software robust is an iterative process: you build and test it as best you can, but inevitably in production you’ll discover new areas of the input space that lead to failure. Like rockets, it’s crucial to have excellent monitoring in place so that these issues can be diagnosed.“. From the content: Sources of uncertainty (bugs, humans, requirements, inputs, ..), Engineering for uncertainty (minimize dependencies, lessen % of cascading failure [JH: -> Hystrix], measure and monitor)

Suffering-oriented programming is certainly also worth reading (summary: do not start with great designs; only start generalizing and creating libs when you have suffered enough from doing things more manually and thus learned the domain; make it possible > make it beautiful > make it fast, repeat)

Jez Humble: The Case for Continuous Delivery – read to persuade manager about CD: “Still, many managers and executives remain unconvinced as to the benefits [of CD], and would like to know more about the economic drivers behind CD.” CD reduces waste: “[..]online controlled experiments (A/B tests) at Amazon. This approach created hundreds of millions of dollars of value[..],” reduces risks: “[..] Etsy, has a great presentation which describes how deploying more frequently improves the stability of web services.” CD makes development cheaper by reducing the cost of non-value-adding activities such as integration and testing. F.ex. HP got dev. costs down by 40%, dev cost/program by 78%

Request Quest (via @jraregris) – entertaining and educational intractive quiz regarding what does (not) trigger a request in browsers and differences between them (and deviances from the standard) – img, script, css, etc.

The REST Statelessness Constraint – a nice post about statelessness in REST if you, like me, don’t know REST so much in depth; highlights: Statelesness (and thus the requirement for clients to send their state with every request) is a trade-off crucial for web-scale and partially balanced by caching – while typical enterprise apps have different needs (more state, less scale) so REST isn’t a perfect match. Distinguish application (client-side) and server (resources) state. Using a DB to hold the state still violates the requirement. Use links to transfer some state (e.g. contain a link to fetch the next page of records in the response).

Cognitive Biases in Times of Uncertainty – people under pressure/stress start to focus on risks over gains and (very) short-term rather than long-term and thus also adopt 0-some mindset (i.e. if sb. else wins, I loose) => polarization into we x them and focus on getting as big piece of the cake possible at any price, now, dismissal of collaboration. With accelerating rate of change in the society due to technology, this is exactly what is happening. How to counter it? Create more positive narratives than the threat-based ones (views of the world), support them via short-term gains. Bottom line: each of us must work on spreading a more positive attitude to save us from bleak future.

Book – Nathan Marz: Big Data – I dislike the big data hype (and, with passion, Hadoop) but would love to read this book; it presents a fresh look at big data processing, heavily inspired by functional programming. Nathan has plenty of experiences from Twitter and creating Storm and Cascalog (both in Clojure, btw.). Read ch 1: A new paradigm for big data.

Schmetterling – Debug running clojure processes from the browser! – upon an exception, the process will pause and S. will show the stack, which you can navigate and see locals and run code in the context of any stack frame; you can also trigger it from your code

Local state is harmful – how can we answer the questions about when/why did state X change, how did output Y get where it is? Make state explicit, f.ex. one global map holding all of it, and perhaps not just the current state but also history – thus we can easily query it. Prismatic’ Graph can be used to make the state map, watches to keep history. Inspired by databases (Datomic is an excellent example of SW where answering such questions is trivial)

Nicholas Kariniemi: Why is Clojure bootstrapping so slow? – don’t blame the JVM, most time spent in clojure.core according to this analyzes on JVM and Android (create and set vars, load other namespaces); some proposals for improving it – lazy loading, excluding functionality not used, …

Cheat your way to running CLJS on Node – (ab)use D. Nolen’s mies template intended for client-side cljs development to create a Node project; the trick: compile everything into 1 file so that Node does not fail to find dependencies, disable source maps etc. Update: the nodecljs template now does this

A short story about the complexity of magical frameworks and dependency injection with a happy ending, featuring Resteasy, CDI, and JBoss.

Once upon time, I have created a JAX-RS webservice that needed to supply data to a user’s session. I wanted to be fancy and thus created a @Singleton class for the exchange of information between the two (since only a user request serving code can legally access her session, a global data exchange is needed). However sharing the singleton between the REST service and JSF handler wasn’t so easy:

Originally, the singleton was generic: OneTimeMailbox<T> – but this is not supported by CDI so I had to create a derived class (annotated with @Named @Singleton)

While everything worked in my Arquillian test, at runtime I got NullPointerException because the @Inject-ed mailbox was null in the service, for reasons unclear. According to the internets, CDI and JAX-RS do not blend well unless you use ugly tricks such as annotating your service with @RequestScoped (didn’t help me) or use JBoss’ resteasy-cdi module.

Finally I got fed up by all the complexity standing in my way and reverted to plain old Java singleton (OneTimeMailbox.getInstance()) while making testing possible with multiple instances by having a setter an alternative constructor taking the mailbox on each class using it (the service and JSF bean) (using a constructor might be even better).

I needed a quick and simple way to enable some users to query a table and figured out that the easiest solution was to use an embedded, ligthweight HTTP server so that the users could type a URL in their browser and get the results. The question was, of course, which server is best for it. I’d like to summarize here the options I’ve discovered – including Gretty, Jetty, Restlet, Jersey and others – and their pros & cons together with complete examples for most of them. I’ve on purpose avoided various frameworks that might support this easily such as Grails because it didn’t feel really lightweight and I needed only a very simple, temporary application.

I used Groovy for its high productivity, especially regarding JDBC – with GSQL I needed only two lines to get the data from a DB in a user-friendly format.

My ideal solution would make it possible to start the server with support for HTTPS and authorization and declare handlers for URLs programatically, in a single file (Groovy script), in just few lines of code. (Very similar to the Gretty solution below + the security stuff.)

Two phase release planning – the best way to plan something somehow reliably is to just start doing it, i.e. just start the project with the objective of answering “Can this team produce a respectable implementation of that system by that date?” in as short time as possible (i.e. few weeks). Then: “Phase 2: At this point, there’s a commitment: a respectable product will be released on a particular date. Now those paying for the product have to accept a brute fact: they will not know, until close to that date, just what that product will look like (its feature list). What they do know is that it will be the best product this development team can produce by that date.” Final words: “My success selling this approach has been mixed. People really like the feeling of certainty, even if it’s based on nothing more than a grand collective pretending.”

Jay Fields’ Thoughts: Compatible Opinions on Software – about teams and opinion conflicts – there are some areas where no opinion is really right (e.g. powerful language vs. powerful IDE) yet people may have very strong feeling about them. Be aware of what your opinions are and how strong they are – and compose teams so that they include more less people with compatible (not same!) opinions – because if you team people with strong opposing opinions, they’ll loose lot of productivity. Quotes: “I also believe that you can have two technically excellent people who have vastly different opinions on the most effective way to deliver software.” “I suggest that you do your best to avoid working with someone who has both an opposing view and is as inflexible as you are on the subject. The more central the subject is to the project, the more likely it is that productivity will be lost.”

JBoss Byteman 2.0.0: Bytecode Manipulation, Testing, Fault Injection, Logging – a Java agent which helps testing, tracing, and monitoring code, code is injected based on simple scripts (rules) in the event-condition-action form (the conditions may use counters, timers etc.). Contrary to AOP, there is no need to create classes or compile code. “Byteman is also simpler to use and easier to change, especially for testing and ad hoc logging purposes.” “Byteman was invented primarily to support automation of tests for multi-threaded and multi-JVM Java applications using a technique called fault injection.” It was used e.g. to orchestrate the timing of activities performed by independent threads, for monitoring and statistics gathering, for application testing via fault injection. Contains a JUnit4 Runner for easily instrumenting the code under test, it can automatically load a rule before a test and unload it afterwards:

How should code search work? – a thought-provoking article about how much better code completion could be if it profited more from patterns of usage in existing source codes – and how to achieve that. Intermediate results available in the Code Recommenders Eclipse plugin.

How to GET a Cup of Coffee, 10/2008 – good introduction into creating applications based on REST, explained on an example of building REST workflow for the ordering process in Starbucks – a “self-describing state machine”. The advantage of this article is that it presents the whole REST workflow with GET, OPTIONS, POST, PUT and “advanced” features such as the use of If-Unmodified-Since/If-Match, Precondition Failed, Conflict. The workflow steps are connected via the Location header and a custom <next> link tag with rel and uri. Other keywords: etag, microformats, HATEOS (-> derive the next resource to access from the links in the previous one), Atom and AtomPub, caching (web trades latency for scaleability; if 1+s latency isn’t acceptable than web isn’t the right platform), URI templates (-> more coupling than links in responses), evolution (-> links from responses, new transitions), idempotency. “The Web is a robust framework for integrating systems at local, enterprise, and Internet scale.”

Links to Keep

Tools, Libraries etc.

ClusterSSH – whatever commands you execute in the master SSH session are also execute in the slave sessions – useful if you often need to execute the same thing on multiple machines (requires Perl); to install on Mac: “brew install csshx”

Twitter Finagle – “library to implement asynchronous Remote Procedure Call (RPC) clients and servers. Finagle is flexible enough to support a variety of RPC styles, including request-response, streaming, and pipelining; for example, HTTP pipelining and Redis pipelining. It also makes it easy to work with stateful RPC styles; for example, RPCs that require authentication and those that support transactions.” Supports also failover/retry, service discovery, multiple protocol (e.g. http, thrift). Build on Netty, Java NIO. See the overview and architecture.

Eclipse Code Recommenders – interesting plugin in incubation that tries to bring more more intelligent completion based more on context and the wisdom of the crowds (i.e. patterns of usage in existing source codes) to Eclipse

Clojure Corner

Clojure/huh? – Clojure’s Governance and How It Got That Way – an interesting description how the development of Clojure and inclusion of new libraries is managed. “Rich is extremely conservative about adding features to the language, and he has impressed this view on Clojure/core for the purpose of screening tickets.” E.g. it took two years to get support for named arguments – but the result is a much better and cleaner way of doing it.

Quotes

A language that doesn’t affect the way you think about programming, is not worth knowing-

– Alan Perlis

Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.

The logging in Jersey, the reference JAX-RS implementation, is little sub-optimal. For example if it cannot find a method producing the expected MIME type then it will return “Unsupported mime type” to the client but won’t log anything (which mime type was requested, which mime types are actually available, …). Debugging it isn’t exactly easy either, so what to do?