Sunday, December 28, 2008

Code coverage reports are great; I love the information they give me. I also love the idea of failing a build if your code coverage metrics drop below a certain point. But I think it's generally accepted that code coverage numbers can be very misleading. A low % of lines covered is certainly bad, but a high % of lines covered doesn't necessarily mean you've done a good job either. You could have a bunch of tests that don't really exercise many edge cases but still hit all the lines of code. That doesn't mean the code has been tested well.

I find this sort of thing happens with integration tests. One medium-sized integration test can "cover" lots and lots of code; I could remove a number of unit tests and still have the same coverage % because I have a lot of integration tests. These days I'm not as interested in the coverage % (okay, I still want close to 100). What I really want to know is: if I run Emma (for example) on FooTest, is Foo 100% covered? In my ideal world each test would cover its related class 100%. I find the Emma plugin for Eclipse really helpful for that kind of analysis. And I'd love a tool that would give me that kind of report.

Sadly that tool doesn't exist. In the current world coverage metrics are great, but they leave something to be desired. After discovering the moreUnit plugin I've realized how a tool like that could help enhance coverage metrics. I want to know, for every public method in Foo, whether there is a corresponding test method in FooTest. If you had this kind of metric in combination with code coverage %, it could put a confidence value on how good your code coverage is. Sadly that tool doesn't exist either.
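You can sketch a crude version of that per-method metric with reflection. This is just an illustration, not how any real plugin does its analysis - Foo, FooTest, and the testXxx naming convention here are all hypothetical stand-ins:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CoverageConvention {

    // hypothetical class under test
    static class Foo {
        public int bar() { return 1; }
        public int baz() { return 2; }
    }

    // hypothetical test class - note there is no test for baz()
    static class FooTest {
        public void testBar() { }
    }

    // report the public methods of clazz that have no matching testXxx* method in testClazz
    static List<String> missingTests(Class<?> clazz, Class<?> testClazz) {
        Set<String> testNames = new HashSet<String>();
        for (Method m : testClazz.getDeclaredMethods()) {
            testNames.add(m.getName());
        }
        List<String> missing = new ArrayList<String>();
        for (Method m : clazz.getDeclaredMethods()) {
            String name = m.getName();
            String expected = "test" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            boolean found = false;
            for (String t : testNames) {
                if (t.startsWith(expected)) { found = true; break; }
            }
            if (!found) missing.add(name);
        }
        return missing;
    }

    public static void main(String[] args) {
        // prints the methods of Foo that have no corresponding test method
        System.out.println(missingTests(Foo.class, FooTest.class));
    }
}
```

Wire a check like that into the build and you could fail on "untested methods" instead of raw line coverage.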

But what would be even greater than two tools that don't exist is a third tool that combines the two. I would love to know that my FooTest.bar*() methods give me 100% coverage on the Foo.bar() method. Having something like that would give me very high confidence in my code coverage metrics. I'm guessing as code quality metrics move along we'll start seeing tools like that being developed.

One issue with the moreUnit tool is that to do its analysis it requires test method naming conventions, which in my eyes seem to be in continual development. Google around a little bit: lots of people argue for really long names; JUnit 3 required test method names to start with "test"; http://blog.jayfields.com/2008/05/testing-value-of-test-names.html kind of believes there should be no test method names at all; I personally like to shun the standard Java camel-cased method names and go the Ruby route with underscores (my co-workers don't like that at all). In any case, it's clear (at least to me) that test naming is difficult. But the idea that you could get some really valuable reporting out of standardized test method names seems like a good reason to embrace standardization (at least a little bit). I'm personally ready to start prefixing all my test method names with "test" just to get some of the ad-hoc metrics that moreUnit offers. I think once more tools come out that assume testing conventions we'll start to get even more value and flexibility out of our test code.

Update - it turns out that hacking the plugin wasn't too hard; I now have the plugin recognizing method names like "foo()" instead of "testFoo()". I won't have to change my method names after all!

wow... the moreUnit plugin for eclipse is just what I've been looking for, it's amazing. One of the more important things it does is associate Classes with Test Classes - So if you have Foo it knows that FooTest is related to it. From there it does a number of great things.

Handles renaming correctly - if you rename Foo to Foo2 it will automatically rename FooTest to Foo2Test. For that functionality it's worth it just to use this plugin.

Ctrl+J toggles between Foo and FooTest AND if FooTest doesn't exist yet, it prompts you to create it. Woo hoo! I constantly create a new class - then Ctrl+N to create the test; this cuts down one annoying step, pretty cool.

It associates method names with test method names. And does a few awesome things with that association.

If you have:
Foo.bar() and FooTest.testBar()
you rename Foo.bar() to Foo.foo()
it will rename your test method to FooTest.testFoo()
it will even handle multiple test methods - if you had testBarA() and testBarB() (anything matching testBar*()), it will rename all of them.

Because it associates methods with test methods, it can provide you with a view that shows all the test methods you're missing for a class. (Emma's great, but it doesn't give you this kind of detail - you can always trick yourself into thinking you covered stuff when you haven't.) This is pretty awesome: imagine failing the build if you don't have a test method for every method in a class - I think that would be much more valuable than failing on code coverage metrics. In this view you can also create the missing methods in your test class.

I have a bone to pick though. In JUnit 4 your method names really shouldn't begin with the word "test" any longer, because you already have that information in the form of an annotation at the beginning of your test method - BUT for moreUnit to know that a method in a test class is associated with a method in your class under test, you need to prefix it with "test".

Some of the documentation doesn't match up to the most recent version, but 1.2 was just released before Christmas, so I assume it will be updated at some point. For example, I don't seem to be able to set up project-specific settings - and the test directory setting (which specifies the directory where you create new tests) doesn't appear to have any effect.

So I've just been playing with it for an hour or so, but this thing is great. I can't wait to start using it day to day.

Monday, December 22, 2008

I recently made the switch from Bloglines to Google reader. I'm definitely happy I did. Google reader does a few things much better than bloglines.

It has some nice keyboard shortcuts http://www.google.com/help/reader/faq.html#shortcuts : n/p (next/previous), o (open/close), s (star), m (mark as read/unread). The shortcuts are simple to use + remember.

I really like the Feed / Subscription pane. It shows unread blog posts in bold, but then it has all the other blog posts there below it. It's really easy to scan through old posts and find the one you're looking for.

Speaking of finding stuff you're looking for, it also allows you to star a post. There are a few posts that I go back and re-read. My usual MO is to add a bookmark in firefox to the post. But being able to star an item and come back to it is really a great feature for me.

Top Recommendations - I really like this: three links that Google thinks I'd like. For example I found out No Fluff Just Stuff has a feed - an aggregation of all the speakers' blog posts - and it's a really nice feed.

I'm interested to see if Bloglines will do a redesign anytime soon. But for me it doesn't compete with Google Reader's features and usability.

Updated - I just have to add, I've been exploring the rest of the keyboard shortcuts for Google Reader + they're awesome - no more mouse. The commands to browse your subscriptions are great: shift n/p is next/previous subscription, shift o is open subscription. This is a really nice job of adapting the similar key bindings to a different context... it's very easy to use. Then there are the g? commands - go somewhere: gs goes to starred items, gh goes home, gu (I don't know what the u stands for) goes to a subscription. It brings up an awesome Quicksilver-esque completion window; it's pretty and nice to use.

Monday, November 17, 2008

So I've been thinking about different kinds of contracts your software can have, hard contracts and soft contracts. If I look at the javadoc for List, it tells me what's expected of me if I implement a List.

add(int index, Object element)
Inserts the specified element at the specified position in this list (optional operation).

Great, I implement a List, BsList, and pass it off to you. Technically I did everything Java cared about: there are no compile errors, I implemented List and provided all the proper methods. But there's nothing that verifies whether I followed the intent of the List interface. The only thing you could do, if you wanted to know whether I did a good job, would be to write some unit tests. This is a soft contract: I implemented some interface, and you assumed that because I did, my code follows the expected behavior of List.

Now imagine another world. Where Sun not only released the List interface with some Javadoc, but also provided developers with a set of tests. How much more confident would you be in BsList if you could run a Sun approved unit test suite?

Most interfaces have not only a syntactic contract (you implemented the interface) but an expectation about how the implementation should behave. The problem is, that's only captured in Javadoc. I think interfaces need harder contracts bundled with them: they need unit tests. And seriously, you can't tell me you wouldn't love implementing an interface that came with a set of tests. The List interface is big - 23 methods - how do you know you got it right? And why should those tests ever be coded up more than once?
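Here's a rough sketch of what one of those bundled tests could look like, using plain asserts rather than any particular test framework. ListContract and its create() hook are my invention, not anything Sun shipped - the behavioral rule it checks is the add(int, Object) Javadoc quoted above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListContractDemo {

    // a "hard contract": the behavioral spec for List.add(int, Object), as code
    static abstract class ListContract {
        // each implementor supplies a fresh, empty list to run the contract against
        protected abstract List<Object> create();

        // "Inserts the specified element at the specified position in this list"
        void addInsertsAtIndex() {
            List<Object> list = create();
            list.add("a");
            list.add("c");
            list.add(1, "b");
            if (!list.equals(Arrays.asList("a", "b", "c")))
                throw new AssertionError("add(int, Object) did not insert at the index");
        }
    }

    // running the whole suite against an implementation is one small subclass away
    static class ArrayListContract extends ListContract {
        protected List<Object> create() { return new ArrayList<Object>(); }
    }

    public static void main(String[] args) {
        new ArrayListContract().addInsertsAtIndex();
        System.out.println("ArrayList honors the add(int, Object) contract");
    }
}
```

A real suite would have one such method per behavioral clause in the Javadoc, but the shape stays this small.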

Who knows, maybe it would be a bad idea for Sun to do something like that, but I know I'll be providing unit tests with my interfaces.

Thursday, November 13, 2008

I've been working on some library code recently, and working on a few implementations of an interface. I wrote and tested my first implementation, then started on the second one. I didn't think about it in advance but of course about 90% of the tests needed to be the same. I realized I needed to verify that every implementation adhered to the API spec I laid out.

I ended up moving all the shared "API" tests into an abstract base class with one abstract method, create(). So far so good. It feels a little weird to have an abstract class doing testing, but it's also kind of cool. Instead of having Javadoc describe your API, there is a clean set of tests that you can look at. Also, if you want to implement the interface, not only do you have access to the minimum set of behavioral requirements, it's also easy to run the tests against your implementation.
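The shape of that setup, sketched with a made-up Stack interface (the interface, both implementations, and the test names are all hypothetical; the real code could just as well use JUnit instead of plain asserts):

```java
import java.util.ArrayList;
import java.util.LinkedList;

public class ApiContractTests {

    // hypothetical library interface with two implementations
    interface Stack { void push(int v); int pop(); boolean isEmpty(); }

    static class ArrayStack implements Stack {
        private final ArrayList<Integer> items = new ArrayList<Integer>();
        public void push(int v) { items.add(v); }
        public int pop() { return items.remove(items.size() - 1); }
        public boolean isEmpty() { return items.isEmpty(); }
    }

    static class LinkedStack implements Stack {
        private final LinkedList<Integer> items = new LinkedList<Integer>();
        public void push(int v) { items.addFirst(v); }
        public int pop() { return items.removeFirst(); }
        public boolean isEmpty() { return items.isEmpty(); }
    }

    // the shared "API" tests live in an abstract base; create() is the only abstract method
    static abstract class StackContract {
        protected abstract Stack create();

        void pushThenPopIsLifo() {
            Stack s = create();
            s.push(1);
            s.push(2);
            if (s.pop() != 2 || s.pop() != 1 || !s.isEmpty())
                throw new AssertionError("stack is not LIFO");
        }
    }

    // each implementation only has to say how to build itself
    static class ArrayStackTest extends StackContract {
        protected Stack create() { return new ArrayStack(); }
    }
    static class LinkedStackTest extends StackContract {
        protected Stack create() { return new LinkedStack(); }
    }

    public static void main(String[] args) {
        new ArrayStackTest().pushThenPopIsLifo();
        new LinkedStackTest().pushThenPopIsLifo();
        System.out.println("both implementations pass the shared API tests");
    }
}
```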

I wish every API I had to implement had a hard contract of tests that I had to pass, that would make life so much easier.

Wednesday, October 8, 2008

This visualization, called code_swarm, shows the history of commits in a software project. A commit happens when a developer makes changes to the code or documents and transfers them into the central project repository. Both developers and files are represented as moving elements. When a developer commits a file, it lights up and flies towards that developer. Files are colored according to their purpose, such as whether they are source code or a document. If files or developers have not been active for a while, they will fade away. A histogram at the bottom keeps a reminder of what has come before.

A developer did it on our repository and showed it at a team meeting, it was a lot of fun to watch, I definitely recommend it.

Thursday, October 2, 2008

Let's say you have a domain object and there is a DSL that acts upon the domain object. There are some constraints:

The DSL is only allowed to access certain methods on the domain object

If the DSL tries to access a method it's not allowed to, you can throw an exception

You didn't write the DSL; it's something like MVEL (or it is MVEL), so you can't modify it.

You can wrap, modify, or do whatever you want to the domain object before exposing it to the DSL.

The DSL looks like "foo() > 1", where the DSL has an object and will call foo() on it.

How would you implement this?

One option, which is pretty straightforward: use a facade and expose to the DSL only the specific methods from your domain object. This is not too bad if there are 3-5 methods, but assume there are 20 methods, or even 10 objects each with 20 methods, that you want to expose to the DSL. A facade might not be a bad idea, but it does start to incur some maintenance cost. And developer annoyance.

Another option would be to use metadata / annotations (I know, I know, metadata is for noobs, but I'm a noob, so it's okay for me to use it). Instead of the facade you could annotate each method you expose to the DSL. Great, it's easy to annotate methods, but how would you throw an exception if the DSL tried to do some bad stuff? This is where something like AOP / CGLib can come into play. You proxy your object with CGLib and intercept all method calls; if the method does not carry your annotation, you throw an exception. Take your proxied object and expose it to the DSL. Now the DSL has a proxied version of your domain object that won't let it call methods it shouldn't. Problem solved.
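A minimal sketch of the annotation approach, using a JDK dynamic proxy instead of CGLib (JDK proxies only work against interfaces, while CGLib can proxy concrete classes - the idea is the same). DslVisible, Domain, and DomainImpl are all made-up names:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class DslGuard {

    // mark the methods the DSL is allowed to call
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface DslVisible { }

    public interface Domain {
        @DslVisible int foo();
        void delete();  // deliberately NOT exposed to the DSL
    }

    static class DomainImpl implements Domain {
        public int foo() { return 42; }
        public void delete() { }
    }

    // wrap the domain object; calls to un-annotated methods throw
    static Domain guard(final Domain target) {
        return (Domain) Proxy.newProxyInstance(
            Domain.class.getClassLoader(),
            new Class<?>[] { Domain.class },
            new InvocationHandler() {
                public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                    if (!method.isAnnotationPresent(DslVisible.class))
                        throw new UnsupportedOperationException(method.getName() + " is not DSL-visible");
                    return method.invoke(target, args);
                }
            });
    }

    public static void main(String[] args) {
        Domain d = guard(new DomainImpl());
        System.out.println(d.foo());  // allowed
        try {
            d.delete();               // blocked
        } catch (UnsupportedOperationException e) {
            System.out.println("blocked: " + e.getMessage());
        }
    }
}
```

The proxied object, not the raw one, is what you hand to the DSL.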

The real question is which is better the facade or the metadata? I'm really not sure.

Friday, September 26, 2008

I was listening to an On Point show last night about proverbial wisdom, and it struck me how much software engineering is overwhelmed with proverbs. Someone even wrote a book of them, Programming Proverbs.

Here are a few I can think of:

KISS (Keep It Simple Stupid)

DRY (Don't Repeat Yourself)

When in doubt leave it out.

Choose two: Good, Fast, Cheap

There's no silver bullet

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away” - Antoine de Saint Exupéry

No Broken Windows

People don't often take proverbs seriously. But I find them extremely useful when writing software. I don't think I'm the only person who finds them helpful. How often do you think about the DRY principle, or KISS, when writing software? These proverbs have invaded the language of software engineering. I think their value suggests something about either the nature of our industry, or the current state of it. I wonder if other industries are riddled with proverbs.

Saturday, September 20, 2008

Everyone loves to measure things. Eric told me the other day a story he heard at NFJS about how Henry Ford publicly measured employees' performance at producing I-beams. The mere process of public measurement increased the number of I-beams produced (I looked around on the internet, couldn't find any references).

Recently at work we started playing the Hudson Continuous Integration Game. In this "game" there is a public record of points. Your check-ins can net you points or lose you points. The rules that we play with are:

-10 points for breaking a build

0 points for breaking a build that already was broken

+1 point for doing a build with no failures (unstable builds give no points)

-1 point for each new test failure

+1 point for each new test that passes

+3 points for removing a warning, TODO, or fixing findbugs errors

-3 points for checking in a warning, TODO, or creating findbugs errors

Each month we reset the scores. This is the third month we're doing this. The top three get prizes (a toy from the dollar store to display proudly on their desk); the loser also gets a toy, the cockroach of shame. We're halfway through this month, and the top three all have 200+ points (at the moment I'm #2), then the point count drops off considerably. I believe #4 has 100 points, and the person in last place has -1 (note: 25% of the people playing are full-time developers; the other 75% are scientists who do a little development, but everyone plays).

This is far from a perfect measurement of performance, but I have to tell you the fear of public ridicule for having low points (or just my competitive nature) has certainly made me go right back and fix any findbugs errors, and implement TODOs rather than just leave them there. It's kind of neat on a personal level, but it has had an effect on our team as well. It's encouraged other people to clean up their warnings and fix easy problems, and it's started a lot of discussions about good coding practice (I think this has been the most valuable thing it's done). The most controversial rule is losing 3 points for checking in a TODO.

There are a number of people who feel that checking in a TODO shouldn't lose you any points, that it will encourage people to just not mark things as TODO when they should. I can totally see this point. On the other hand, no matter how much I hate to lose points, if I have 2-3 things in a month that really are TODOs and I don't have time to implement the feature right then, I'm okay losing 6-9 points... I created work by checking in. I should get dinged. In my eyes this encourages people not to check in if they're going to create work for other people.

The whole process has been very fun. And it's started a number of conversations about development with people who weren't talking about it so much. I'm currently measuring the success of the game by how much people are talking about it. This month that measurement is at 104, I hope next month the success of the game is 150.

It must be nice testing a UI that doesn't require massive amounts of state to get to the feature to test. Maybe testing in general will push app development towards a world where parts/features of apps require smaller and smaller amounts of state setup to work. I know testing has pushed my code in that direction. But what would it even mean for an app?

Monday, September 8, 2008

Just started using a JUnit-style Antlr testing package called Antlr Testing. It's great: it lets you test your lexer, parser, and tree walker. The documentation is light but good enough, and it's pretty easy to use (at least on a small grammar).

Saturday, September 6, 2008

So in fact the grammar I wrote yesterday is an LL(2) grammar. Antlr is an LL(*) parser, but you're allowed to specify how many tokens you want it to look ahead (I'm still unclear if Antlr is smart enough to take something that is LL(K) and generate source code that takes advantage of it).

LL(K) means that for a given language the parser has to look ahead up to K tokens to determine which rule applies. There are parsing optimizations to be made if you specify K.

Friday, September 5, 2008

I'm doing a little more with Antlr these days, and I had made some assumptions that turn out not to be true. I had assumed that you would need clear terminations of rules (like new lines or semicolons or something), and that rules couldn't overlap - but they can, and I believe this is related to the look-ahead functionality of the Antlr parser. Check it out:

What's really cool about this parser is that you can input to it "A:AAAAA:AA" and it can figure out to split that up into:

I thought I was going to get warnings about multiple execution paths, but I didn't! This makes it much easier to think about and write grammars. I have to assume that this is bad practice if building a medium to large language, but I'm only dealing with little languages, so I'm thinking it's okay for now

For me a DSL is another way a user can interact / configure a system so it seems related to but, different than a traditional UI. I think it's a closer abstraction to the domain than the UI, but you can easily imagine anything you do in a DSL you could do in the UI. So in a sense designing DSL first is a top-down approach, but in my very limited experience it's so close to the domain model, that it really feels like a middle-out approach.

Anyway, it was a really interesting experience and let me think about what features I needed in my domain layer very easily. And because I was playing with so many syntaxes I finally settled on something that let me loosen up the design quite a bit more than I would have if I had started in the domain layer first.

Monday, August 25, 2008

Refactor: Extract Method. I had never used this much until last week. Neal Ford was talking it up in The Productive Programmer, and a co-worker mentioned offhand that he used it all the time.

So I gave it a try and must say I've never used anything better when working with poorly written legacy code. Ctrl+Shift+M: bye bye 40 lines of nested if statements and for-loops, hello descriptive name. Besides making the code prettier and easier to read, I found it does 2 important things.

After blocks of code are broken up into specific groupings (methods), it's really easy to start re-organizing the code. With a lot of the code I'm working on, the initial writer took shortcuts and bundled things together in ways that made the logic confusing. Once I started extracting methods, it became really easy to reorder and regroup how those methods were called into something that read much more easily.

After I was able to reorder things, suddenly the code became declarative. The method that was doing all the work said what it was going to do, and if you wanted to read the details, you just popped into the methods. That this simple refactoring can lead to nice declarative code is, for me, the real selling point.
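A tiny before/after to show what I mean - the names and numbers are made up, but the shape is the point: after extracting, the top-level method just declares the steps:

```java
import java.util.Arrays;
import java.util.List;

public class ExtractMethodDemo {

    // BEFORE: one method doing everything inline
    static int invoiceTotalBefore(List<Integer> prices) {
        int sum = 0;
        for (int p : prices) {
            sum += p;
        }
        int tax = sum / 10;
        return sum + tax;
    }

    // AFTER extract method: the top-level method reads declaratively
    static int invoiceTotal(List<Integer> prices) {
        int sub = subtotal(prices);
        return sub + tax(sub);
    }

    static int subtotal(List<Integer> prices) {
        int sum = 0;
        for (int p : prices) {
            sum += p;
        }
        return sum;
    }

    static int tax(int subtotal) {
        return subtotal / 10;  // hypothetical flat 10% tax
    }

    public static void main(String[] args) {
        List<Integer> prices = Arrays.asList(100, 200);
        // both versions agree; only the readability changed
        System.out.println(invoiceTotalBefore(prices) == invoiceTotal(prices));
    }
}
```

Scale the "before" up to 40 lines of nested ifs and the payoff is obvious.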

So I had never thought refactor extract method was particularly useful, but now anytime I see code I don't want to look at, I just refactor it away, eclipse handles the magic, I end up with declarative code, and I love it!

Saturday, August 2, 2008

So I'm reading Neal Ford's book The Productive Programmer - pretty good so far, very much in the vein of The Pragmatic Programmer. It's full of little tools he likes. For example: two Eclipse plugins from http://www.mousefeed.com/.

1) Close tabs with middle click. God I've always wanted that in Eclipse ever since I discovered it in firefox.

2) Key Promoter. I don't know if I'll love or hate this yet, but anytime you do something manually that could have been a shortcut key, it displays a little popup telling you the shortcut key and the number of times you haven't used it (just like IntelliJ does).

Oh... and I just discovered Ctrl+Shift+L, which pulls up a popup of all the key bindings for Eclipse.

Thursday, July 31, 2008

I've always thought of AOP in terms of things like AspectJ, which manipulates byte code (I don't really love the idea of byte code injection), and never really as a design pattern that could be implemented in Java. But lo and behold there are plenty of Java-based implementations (I believe based on dynamic proxies). I've been using Spring's implementation of the AOP Alliance API http://aopalliance.sourceforge.net/doc/index.html

Using AOP really allows you to cut down on non-model-related code in your model. One of the classic examples is logging: how great would it be to not see any logging code in your model code? We do lots of reporting on our models, and sometimes it's very difficult to hook into the model at certain points to do the reporting.

We just started adding AOP hooks to our factories, which has been great. Now we just pass a method interceptor to a factory, and the factory takes care of wrapping the appropriate objects with our interceptor. On the reporting side we keep a reference to our method interceptor; as our code runs, the method interceptor collects state that we can report on. It's a totally non-invasive implementation of the observer pattern. I love it.

Suddenly we have a whole new world of reporting and testing that we can do which would have been very difficult previously.

Seriously, think about testing. Say you're unlucky enough to be stuck with legacy code that's hard to work with, but lucky enough that the developers used factories. If you can slip in a mechanism for wrapping objects, suddenly what was difficult to test is easy! You just intercept a method call on an object you care about and verify that its state is correct, or whatever the heck you want to do. I'm finding it über powerful.

But with über power comes über responsibility. An implementation of a method interceptor might look like so:
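The real code uses Spring's AOP Alliance support, where a MethodInterceptor gets a MethodInvocation carrying the target, the method, and the arguments. To keep this sketch self-contained I'm faking the same shape with a plain JDK dynamic proxy - Foo and the reporting details are made up:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReportingInterceptor implements InvocationHandler {

    private final Object target;
    final List<String> calls = new ArrayList<String>();  // state collected for reporting

    ReportingInterceptor(Object target) {
        this.target = target;
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // we have the target instance AND the arguments right here - read them, don't mutate them!
        calls.add(method.getName() + Arrays.toString(args == null ? new Object[0] : args));
        return method.invoke(target, args);
    }

    // hypothetical domain interface
    public interface Foo {
        int bar(int x);
    }

    public static void main(String[] args) {
        ReportingInterceptor interceptor = new ReportingInterceptor(new Foo() {
            public int bar(int x) { return x * 2; }
        });
        Foo foo = (Foo) Proxy.newProxyInstance(
                Foo.class.getClassLoader(), new Class<?>[] { Foo.class }, interceptor);
        foo.bar(21);
        // the reporting side reads the collected state off the interceptor
        System.out.println(interceptor.calls);
    }
}
```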

I just want to point out that here you suddenly have access to the instance of foo and also all the arguments that were passed in. So woe unto thee who changes state whilst intercepting a method, because good luck debugging that. Besides that troubling point, this stuff is awesome.

Yeah, "lo and behold" and "woe unto thee" - I wonder how many idioms I can slip into a blog post without running out of steam.

Wednesday, July 16, 2008

The two Sets each represent the same object identities but they are being evaluated in different contexts. At some point later in time I need to find out the differences between S1.f1 and S2.f1

Now Set has some great methods on it like:

contains(Object o)

remove(Object o)

And then there's HashSet, which is a Set backed by a sweet sweet HashMap, which should make contains() + remove() fast.

So what I need to do at some point is:

for (F f : S1) {
    F fLocal = S2.get(f);
    // do some comparisons between fLocal and f
}

but I can't. Neither the Set nor the HashSet API has a get(Object o). They make the assumption that if .equals() returns true there are no differences between the two objects. I think I disagree with this premise. At least in my recent experience, .equals() is talking about an ID of an object. With that ID I wish to track the state of that object as evaluated in different contexts.

Maybe this is abuse of .equals() and I should get over myself. But I wish Java's APIs left it up to me to decide the meaning of .equals() and were a little more open-ended.

Maybe it doesn't belong in the core API, but at least they could have implemented this in HashSet.
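The workaround I end up with is a Map keyed on the element itself - one line per Set, and it gives you the get() that Set withholds. F here is a made-up element type where equals()/hashCode() act as the ID while the rest of the state varies by context:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class GetFromSet {

    // hypothetical element: id drives equals()/hashCode(), value is context-dependent state
    static class F {
        final String id;
        final int value;
        F(String id, int value) { this.id = id; this.value = value; }
        public boolean equals(Object o) { return o instanceof F && ((F) o).id.equals(id); }
        public int hashCode() { return id.hashCode(); }
    }

    // a map from each element to itself: Set semantics plus a working get()
    static Map<F, F> asIndex(Collection<F> s) {
        Map<F, F> index = new HashMap<F, F>();
        for (F f : s) {
            index.put(f, f);
        }
        return index;
    }

    public static void main(String[] args) {
        Set<F> s1 = new HashSet<F>(Arrays.asList(new F("a", 1)));
        Map<F, F> s2 = asIndex(Arrays.asList(new F("a", 99)));
        for (F f : s1) {
            F fLocal = s2.get(f);  // the lookup the Set API never gave us
            System.out.println(f.value - fLocal.value);  // compare the two contexts
        }
    }
}
```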

Tuesday, July 8, 2008

After writing a Graphviz dot language parser and one tree walker, I decided to go for a hub-and-spoke architecture vs a chained architecture, where the hub is the dot file and the spokes are multiple tree walkers.

I had already parsed the Dot language to populate an internal domain object. Next I needed to render this in a Java graphing library that allowed me to specify positional information for each node (I chose the prefuse library (http://prefuse.org)). I also knew I wanted to parse everything from one dot file that looked like so:

Option #2 was to write a new tree walker that knows about node and node attribute information, and directly populates a prefuse graph object.

Option #1 seemed like a chained approach, where #2 was more like a hub and spoke. Clearly context is king, but in my context, going for a hub-and-spoke approach really seemed like the best way to go, and looked like soooooo much less work (option #1 has like a million pieces where #2 had like 0 - really, which would you do?).

I implemented option #2 and after working out the fiddly bits with prefuse it only cost 2 hours to build and test.

I think that's pretty fast.

It seems that Antlr really encourages going towards a hub-and-spoke architecture, and in this case I think that turned out to be a really good thing. Now I have two spokes... how to build more??!?!?! If two is good, more must be better, right?

One mistake I made: I didn't do any unit testing of the dot language parser. The tree walkers are well tested, so the parser is at least tested via integration. I haven't spent enough time figuring out how to test the parser, so I guess that's the next step - which should have been the first step, oh well.

Wednesday, July 2, 2008

I'm looking into graphing libraries right now, and I've been working with Graphviz because some folks here use it. Another language / tool that also got mentioned was GraphML, which from what I've been told is a little more powerful than Graphviz.

Here's a sample of the Graphviz dot language:

graph g {
    A -- B;
}

Guess what, that creates two nodes with circles around them and connects them with a line.

Unlike many other file formats for graphs, GraphML does not use a custom syntax. Instead, it is based on XML and hence ideally suited as a common denominator for all kinds of services generating, archiving, or processing graphs.

I guess so, but what's better to optimize for: a service using your graphing language, or the poor schmuck who has to type it? Being more powerful is great, but it looks annoying to type. That's really what I want API / language designers to optimize for: how annoying will it be to type :)
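For comparison, here's roughly what that same two-node graph looks like in GraphML (this follows the conventions in the GraphML primer; I haven't run it through a GraphML tool):

```xml
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="g" edgedefault="undirected">
    <node id="A"/>
    <node id="B"/>
    <edge source="A" target="B"/>
  </graph>
</graphml>
```

Eight lines of XML versus three lines of dot, for the same picture.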

Tuesday, July 1, 2008

I wrote a parser for the Graphviz dot language (http://www.graphviz.org/doc/info/lang.html) that spits out an AST containing the edges and subgraphs. I took that AST and populated a graph in our domain model. And honestly, once I got over the hump of learning Antlr, it was really easy.

I wasn't purely looking for an excuse to use the technology (okay, a little bit). We needed a representation for a graph - why invent one when dot is so nice and gives you a visualization for free? So now I have a DSL for creating my domain model (dot) that also gives me a nice visualization of the kind of model I'm creating. Two different projections of the same artifact. That's really cool, I think.

I liked this article; I thought it was interesting to think about different ways to enforce code quality. The article's main focus was comparing TDD to Clean Room software development. If you don't read the article: Clean Room sounds like incredibly intense code reviews where you have to 'prove' mathematically to your peers that your code works. But I think comparing Clean Room to automated testing is comparing apples to oranges.

How would Clean Room help with maintaining legacy or bad code? What's the point in having a room full of people pore over a 1000-line method that only weirdo geniuses can understand? If you're not working with legacy code, then Clean Room sounds great - if you have a room of people who want to pore over your code - but who has that liberty? At all the million (read: 3) places I've worked, intensive code reviews have never been a priority.

Even if you worked somewhere that code reviews were a priority, and everything was peer reviewed constantly and you didn't have any legacy code, then great, forget automated testing! But the minute you don't have all those things you need something else. Peer review is great, but it's very brittle. I think that's one of the advantages of automated testing, you have an artifact that lives on and provides at least some value.

The article did have a good point though: what is the value of automated testing + TDD? I think that's really hard to quantify. Personally, automated testing has rarely been useful for finding bugs (but when it has, I do jump up and down and tell EVERYONE I can - it's awesome!). I have found it very helpful for learning, whether it's an API or legacy code I have to maintain. I also find it very helpful for designing my code (aka TDD). I'm totally addicted to this.

These days I stub out all my classes and start writing tests. I write tests for everything I want to say, then fill in the blanks, then write more tests. It's a very top-down approach, but it's working for me. And what really sells me on it is the great side effect: automated tests that every once in a while make me jump up and down because they found a bug.

And almost always (I just can't seem to help myself) I'm pulling those code blocks into methods and passing an interface in to do the work. I don't think this necessarily makes things any clearer, but I don't seem to be able to stop myself from pulling these loops apart.

"Another question that always comes up: What about writing DRY code? For me, the value behind DRY has never been typing something once, but that when I want to make a change I only have to make it in one place. Since I'm advocating the duplication in a test file, you still only need to make the change in one place. You may need to do a find and replace instead of changing one helper method, but the other people who are stuck maintaining your tests will be very happy that you did."

For example, I took some code that was generating Matlab strings that looked like so: