We’ve all done it. And we’ve almost all regretted doing it. So its time to talk about an uncomfortable subject for many.

Copying and pasting code.

The temptation is constantly there. You see some code here that works (or at least appears to work). You obviously don’t want to reinvent the wheel, and maybe some aspect of the code makes re-factoring it out difficult. Maybe you agonize about it a little, or maybe you blissfully ignore the dangers. In the end, a Control-C and Control-V later, and that block of code has reproduced it self. The world doesn’t tragically end, everyone you know doesn’t die of a horrible disease, and your system still continues to function. So you figure, hey, copying and pasting code isn’t so bad, and so copy and paste that same line of code again. And again. And again. And before you know it, your computer’s clipboard has become an indispensable tool. Maybe you go even so far as to push others to follow in your copy and pasting footsteps. I’ve seen people put in their documentation something along the lines of “if you want to do x, copy this code from file y”. And the world still continues to go on.

Then, suddenly things change.

You find a bug in that original code you copied. Its an easy fix, except that single bug has now reproduced like a virus, infecting your entire system. Or maybe that code was golden, but the requirements do what they love to do so much and completely change. Again, a quick fix to the original code will fix it, but the copying and pasting has, best case, increased the difficulty of the fix by a factor of the number of times you copied that code. Worst case, different copies will have subtle differences (maybe some are still on an older revision of the change), maybe some are so different you are no longer able to recognize their lineage, but under layers of re-factoring the old functionality still lives on.

We all know this risk, but problem is actually more severe than what I’ve described.

Copying and pasting the same code, the same functionality, the same patterns; it all means you’ve missed a chance to abstract out some part of your system. If you have dozens of classes that follow nearly the same pattern, there is something important about that pattern that needs to be captured at a higher level. Not only will that make your code more maintainable, abstracting out these patterns makes it easier to reason about. You can make better deductions about code that is made up of higher abstractions than code that simply looks similar to that other code over there.

And of course in many cases, copying and pasting code is a symptom of a larger problem, a lack of understanding of the original code. It is too easy to find code that does something similar to what you want it to do, and then copying it verbatim. But is that “something similar” exactly what you want? Maybe its doing something subtly different, something you might not need or even want. Maybe there were valid assumptions made when that code was originally written that you have no business making. But if you took the easy route and just copied it without understanding it, you will have no idea that is the case

In other cases its a sign of laziness. Not the type of laziness that avoids work, in fact re-factoring to prevent the need for a copy and paste job is usually less work. But a more intellectual type of laziness. Work that just involves repeating something someone else did is easy, whereas if you can easily get past that easy part you are stuck with more challenging work. Moving code around is easy, solving problems is the hard part.

Of course its not always your fault. Maybe you are working in a language that makes certain kinds of patterns difficult to abstract, maybe even to the point it appears its actively resisting the concept of you being productive (I suppose it could be worse for Lambda, considering what is happening with Jigsaw). Are there are plenty of times when what you are copying actually is too small to be successfully reused. I’m not saying never copy and paste code, or never reuse the same patterns or functionality. Just next time you catch yourself doing it, please stop and ask yourself the following question:

A little while ago I wrote a post about ReSTful web services and how they are distinct from a typical “HTTP done right” web service. I have heard complaints that the characteristics of a ReSTful web service (a given resource will have links to related resources which can be used by the client as state transitions) are only applicable when the client using it a web browser, and that therefore they are more applicable to web interfaces than web services.

While it is true that it was web sites developed for browser based clients that Roy Fielding was originally describing, a web browser is far from the only client that can benefit from this type of web service. Any sort of generic client or client library will benefit from being able to resolve transitions from just looking at the resource. One obvious example most software engineers working with a ReSTful web service will quickly recognize the value of are browser plugins such as Chrome’s ReST Console, which allow you to browse the web service (including navigating links to related resources/states). But there is a much more clear example of a use of ReST staring most of you in the face.

I’m of course talking about RSS/Atom feeds.

Different content providers use completely different url schemes for their articles. For instance WordPress blogs use a url like this:

Specifically the date (In /YYYY/MM/DD format) followed by the blog title. An unfortunate scheme it turns out if something in there has to change (in the above mentioned link, I had actually published the post in June, but for some reason WordPress gave it an April date. Then when I tried updating the publish date, the link was suddenly broken).

Just some long id that probably links to something in their database (nothing to show you the article is about 3D printing airplanes (wait, let me finish my post before you go to that link, it won’t be too long!)).

I guess they wanted both an unique id and a contextual id (I’m presuming they have better editors than me to make sure typos don’t make it in there).

So if you were a fan of my blog, the Wall Street Journal, and the Washington Post, can you build an application the syndicated news from all three sources? Would you need to handle three different url patterns to deal with three different web service designs?

Of course not, thanks to the fact that all three are going to have RSS/atom feeds. These feed formats (admittingly there are several competing formats, including atom and the many versions of rss, but much fewer than the number of sources that use them) have a standard way to link to the article, to describe the context, include pictures, authors, etc. Thus any feed reader that knows how to parse this format can syndicate news from a huge number of providers.

These feeds are so common, we tend to refrain from calling them web services (they’ve certainly been around since before ReSTful web services were the hot new thing in software development). But not only are they web services, they are web services designed to be used by automated clients that track the news and display articles in news tickers or web page mashups or email clients or whatever. And they benefit tremendously from having the atom/rss resources including not just content about the article, but an actual link the to the article. If the wordpress feed just gave the date the article was published on and its title, or the Wall Street Journal feed just gave the internal id they are using, that would be enough to figure out the link if you knew their format. But it would be much more difficult to develop an aggregator that worked on such a feed.

I think the key here is that the strength of these services come not from the url patterns (which so many seem to get so hung up on when designing web services) but rather the resource representation. A good resource representation which is independent from the web service’s implementation, and potentially independent from the webservice itself. It can become a de facto industry standard, allowing client developers to develop clients and libraries around it.

But if your resource representation lacks any sort of ReSTful syntax, it doesn’t matter how popular it is. Developing a library around it is not enough. The consumer still has to find a way to parse your web service to get access to the resources. And no matter how much time you slave over documenting your ideal url formats (which, the First Law of Web Services dictates, if you like your url format, 80% of the rest of the world will think it is crap), that is additional work you are putting on client developers. Their lives will be much easier if you can compact the parts of your web service they need to work on to just the resource representation.

But there is one caveat to all this, your web service’s design is hardly its most important aspect. If the information your service is exposing is useful, developers will be willing to work with it, no matter how poorly designed it is. I think every software developer out there with more than a year or two of experience can point to some service or library they used that was a pain to develop against, but they still used it because it gave them data they needed. And the inverse of the above statement is also true; if the information you are exposing isn’t useful, your web service will never get used no matter how well it is designed. So if you are spending months slaving over the perfect web service, you are probably wasting time you can better spend improving the data it is serving, or delaying your entry to the market, either one of which can do a better job crippling your service much more than a bad API.

Years ago at one of my previous jobs, I remarked to a coworker when we were trying to move the organization to agile that “Before, we were about as agile as an 80 year old man with arthritis. Now we are about as agile as an 80 year old man with arthritis wearing a leotard.” We had done a lot of work dressing ourselves up with agile methodologies, but our core way of operating did not change. And when that happens, the result is about as pretty as the analogy implies.

Agile, it seems, is yet another example of a promising technique that has degraded down to a technology buzzword. As an abstract concept it is easy to wrap your head around, but attempts to implement it seem to have mixed results. But I think there is still quite a bit of usefulness in it, if you know where to look. Lets start by pulling out the thing that began the whole movement, the Agile Manifesto.

Manifesto for Agile Software Development

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value: Individuals and interactions over processes and tools Working software over comprehensive documentation Customer collaboration over contract negotiation Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

I find a few things about this interesting. Lets start with the first value. “Individuals and interactions over processes and tools”. Yet despite this ‘value’, think about how many agile tools are in existence. Tools that let you draw up story boards, generate burn down charts, calculate velocity, etc., all without needing to directly interact with other individuals in the project (and don’t get me started about ‘agile processes’). As far as the second one goes, I’ve heard it remarked how ironic that a philosophy that demphasizes documentation has so many books written about it. The third one is a nice idea, but largely depends on your customer. Often customers see the whole point of paying someone to build a software product for them is so they don’t have to be involved, and to be frank, I’m not certain they are wrong.

But its the last value that I think is both the most core to agile development, and is most often misunderstood. Being able to quickly respond to change, such as new requirements or new discoveries about existing requirements. Because unless you are working on a tiny project you can hack out in a weekend, its unlikely what you finish building looks much like what you had in your head when you started. This idea has been often interpreted as meaning skimping on the design phase, and just go through endless cycles of code, test, repeat until you get something that meets your requirement.

In fact, I think this tends to accomplish the exact opposite of the stated goal; being able to respond to change (unless by change you mean throw everything out and start from scratch). Without considering design principles such as encapsulation and separation of concerns, changing the direction of your codebase is going to be a painful exercise of pushing mounds of unmaintainable spaghetti code around. And not going through a design phase requires you to solidify assumptions in your codebase, which then becomes hard to fix should those assumptions turn out to be false (which, as agilists like to remind us, they likely will).

The best way to write code that can respond to uncertainities and changing requirements is to break the problem down. Then should a requirement change that renders one part of it obsolete, you don’t have to rewrite your entire application, just that single module. And if each module can be made small enough that its risks can be better understood, you will have a much better chance at mitigating those risks once you have reduced the scope of the problem to a small enough size. Of course this break down requires some up front design and thought about the problem. You can’t just divide everything up willy nilly and expect to get pieces that will eventually fit together. For instance, if you have requirements that the application be both reliable and scalable, you can’t separate those two out as distinct requirements. Not only are those requirements that tend to be impacted by every part of your application, but if they are implemented incorrectly they can conflict each other.

Nowhere in the Agile Manifesto’s right side items is the word ‘design’ mentioned as a lesser value, so its odd that its so often assumed agile dismisses it. In fact, much of what agile preaches applies to design just as much as code. Work with your stakeholders when designing software, don’t just throw it in a modeling tool never to be seen again. Make sure your design is focused on working software, not just building up documentation. Make sure your design is not too rigid that it can never respond to change. And always, always remember to test your design decisions, which typically requires working with your customers to make sure what you are designing is what they indeed want.

ReSTful web services have been popular for some time, yet still today there is confusion about what it means. Some will say it is a web service that doesn’t use SOAP standards, others will argue it has to do with using HTTP’s PUT and DELETE methods along with their more popular siblings GET and POST, and still others will argue it has to do with an even more confusing acronym HATEOAS. Everyone seems to have an opinion on what it means, and the only thing that is clear is that they are talking about different things. Which is fairly ironic considering the term is an acronym for a pretty clear idea and its spelled out fairly clearly in Roy Fielding’s paper.

A lot of the problem seems to come down to ReST being marketed as the alternative to SOAP, not as simply an alternative to SOAP. As a result, any web service not using SOAP standards is assumed to be a ReSTful web service. For instance, the delicious API is often used as an example of a bad ReST API, but that really isn’t fair. Its not that it is a bad ReST API, its just not a ReST API at all. Its an RPC API, calling it ReSTful gives neither the term nor the API justice. It certainly is simpler than SOAP APIs, but that isn’t the defining characteristic of ReSTful web services.

Another web service flavor that commonly are confused for ReST are resource oriented web services (for brevity, and since this industry has a fetish for acronyms, lets call them ROWS). A resource oriented architecture is an API that is centered around doing basic functions (POST, GET, PUT, DELETE, in the HTTP vernacular, though create, read, update, delete may be more familiar to developers) around a set of resources. This is a common feature of ReSTful web services, and is indeed a powerful notion. Proponents of RPC style services like to argue that their APIs allow them to do things more complicated that basic “CRUD” operations, but that’s only true if your resources were chosen poorly. The essential difference between the two comes down to how the web service is built up. If you build it with a simple set of resources, you need a complex set of procedures to act on them. And if you built it with a simple set of methods, you need a complex set of resources to act on. Of course you could have both a complex set of resources and procedures, but then you just end up with a convoluted mess.

So if you have a ROWS, how is that not a ReSTful web service? After all, much of the argument for ReST web services is that they use HTTP “correctly” (as opposed to RPC web services which overload GET or POST methods to implement their dozens of procedures), and a properly implemented ROWS does exactly that. But ReST does not stand for “Use HTTP correctly”, Roy Fielding is not that bad of a speller.

ReST stands for Representational State Transfer. What the hell is that? Exactly what it sounds like. It means the resource’s representation contains the means for an application to change to a different state. Think about when you are navigating a typical web page. When you want to move to a different state in the application, you don’t usually go to your url bar and enter in a new url, or make arbitrary POST requests with a set of parameters you got from the web page documentation. You click on a link on the current page. Each web page (or resource, that’s what the R in URL stands for after all) contains links to take you to later states in the application. There is only minimal need for further documentation, basically just the entry point (the home URL) and some way to interpret each resource. The idea behind ReSTful web services is that the same principle can be used in web services that are meant to be accessed by an application other than a web browser. State transitions can be driven not through some hard coded set of URL patterns in the application, but by processing the resource and seeing what the valid next states are (of course with the exception of a very generic application (such as a web browser or other navigational tool), the application will still have to be coded to interpret the resource and pull out the state transition links, so something may still have to be hard coded). So if you were accessing a social media web service and wanted to transition from the state of a person to that person’s friends, you would essentially just access the getFriends link, much as you would access a getFriends method of a Person object in an Object Oriented program.

Admittingly, most of the advantages ReSTful web services have over their SOAP brethren have nothing to do with their being ReST web services but more to do with their resource oriented nature (which is much easier to both work with and scale than RPC services) or their simplicity (even a simple RPC API like delicious’ is going to be easier to use than a SOAP based version). ReST just brings in some structure like the SOAP standards give their respective APIs to web services that are simple and easy to use. But if you don’t need a fully ReSTful API with links from one resource to the other, don’t feel bad about developing a ROWS or even simple RPC web service. For a web service that is designed to be used by a single application, you probably don’t need that additional structure. And for a web service with only one or two resources, ReST is very likely going to be overkill. But please don’t call it a ReSTful web service. Terminology in this industry is already complicated enough without people misusing terms like ReST.

You a member of an agile development team planning out your next sprint. You have estimate your velocity at 33. You currently have a load of 32. There is a remaining story that has been estimated as having 2 story points (using a Fibonacci sequence). Would it be a mistake to try to fit it?

If you find yourself asking this, you are doing it wrong.

One premise of most agile techniques is that we are really bad at estimating. Story points do not try to correct that fact, they simply work around it. Unless you have the gift of psychic clairvoyance, there is no point in attempting high precision estimates because any such estimate will be wrong. Hence the use sizes that increase either exponentially or through a Fibonacci sequence. Assuming you were reasonably accurate, that story you estimated at 13 points might be as little as 12 points. Or it might be closer to 15.

Remember high school chemistry when you learned about significant digits? Story points are so low precision they don’t even have one significant digit. And in a calculation involving low precision measurements, claiming a higher precision result is misleading at best. Its downright fraudulant at worst.

So back to our above scenario, claiming your story point load is “32” is wrong. You don’t have enough precision in your measurements to say that. In reality, your load without that extra story is better expressed as “around 30”. And with the extra story, it is also “around 30”. If your current load is dominated by a couple 13 point stories, those are what will determine whether or not you meet your goal. If it is dominated by many small 1, 2, or 3 point stories, you are misleading yourself if you argue you can predict exactly how many you are going to finish.

Is my point that you should give up on estimating? Of course not. Just don’t obsess yourself with getting all your numbers to line up. Commit to enough that you feel comfortable with, and then give yourself plenty of stretch goals. That way you can meet your commitment if your estimates were too low, and you will have enough to do the entire sprint if your estimates were too high. Because all you really know is that it is unlikely your estimates are spot on.

Just the other day, my brand new Transformer Prime tablet arrived. Aside from being a high quality tablet (quad core processor, one of the very first to offer Android 4.0), it is well known for having a docking station accessory, complete with a keyboard, that essentially transforms it from a tablet to a 10 inch net-book. My phone, a HTC G2, also has a fold out keyboard, as did its predecessor, my old G1 phone. So I think its safe to say, I am a fan of physical keyboards. Sure, voice recognition can be good for some things, and Swype produces a nice on screen keyboard, but if I want to type anything of substance, I’m much more comfortable typing it on a nice hard QWERTY keyboard with actual buttons I can press.

Which makes the fact that I’m writing this post a bit ironic.

There was an interesting podcast last week from the IEEE about the keyboard going the way of the typewriter. Of course I was rather dismayed by the thought. Its not just that oncreen touch keyboards will replace them, but that new input devices, such a stylus with handwriting recognition or a microphone with voice recognition, will be the computer input of the future.

I would argue we are nowhere close to that with today’s technology, at least from where I stand. Being that my Transformer keyboard hasn’t arrived, I originally tried to “write” this blog post with my tablet’s “voice keyboard”, and I couldn’t get through the first paragraph without getting frustrated and giving up. I haven’t really tried using a stylus recently, but handwriting such as mine is typically so bad even I have trouble reading it, so I won’t begrudge any computer program which can’t read it (and before you try to say that’s just because I’m so used to keyboards I’ve lost the ability to write neatly, my grade school teachers would be quick to point out I never had that skill, even before I learned to type). On the other hand, I can type at a reasonably fast pace with pretty good accuracy, so there is no debate on which method is more proficient for me.

But one argument made on the podcast was that kids today will likely grow up so used to voice recognition and handwriting recognition that they may view keyboards as obsolete. That they may offer a technically superior method of writing fast will not matter to them. After all, one could easily argue that command line interfaces can be much more productive that GUIs for many tasks, but outside of hard core hackers, the world has largely moved away from them. Even software developers have largely embraced tools such as Eclipse as an alternative to hacking on the command line.

And I can’t deny that there are some areas which keyboards are not very good at. For instance, look at writing math problems. Math is typically full of Greek letters, superscript/subscripts, and other things which are just plain hard to type. Sure, there are usually obscure keyboard shortcuts for them, and specialized software for it (such as Mathematica), but no real general purpose solution. When I was trying to take notes for the Stanford Machine Learning class on Evernote last year, I can’t tell you how much time I wasted trying to come up with notations for random symbols that kept on coming up.

And of course more creative endeavors such as building “mind maps”, that is just hard to do without a more free flow input format. That’s why many still argue that pen and paper is a superior note taking device. Keyboards are great for writing lines of text using a small set of well known characters, but are rather limited beyond that.

So as keyboard-less input becomes more and more mainstream, how will that affect computer programming? Today, programming is a perfect example of lines of text optimized for keyboard input. Using voice recognition to write a Java program? How insane would that be? “For, begin paren, double, no capital ‘D’ double input colon input-capital-V-vals, end paren, open bracket” instead of just typing:

for (Double input: inputVals) {

Case sensitivity, the frequency of special characters and common symbols, terse variable names, camelCase, none of that will work with voice recognition input. Computer programming is clearly not a place where you want creative, free form input, but you want to heavily restrict it to what are legal values.

Or is it?

Will computer languages evolve to utilize the advantages of newer input methods. Will they start to incorporate more free-form writing rather than just plain text? Will it even be possible to come up with languages like that? Or will future freshman computer science students have to spend hours learning ancient typing techniques that have become obsolete outside of writing programs?