A day in my life. Thoughts on leadership, management, startups, technology, software, concurrent development, etc... Basically the stuff I think about from 10am to 6pm.

8/28/2006

Concurrent Development: Required Personality Traits

I came across this post by Rob Walling. Rob writes about four personality traits that the best software developers have. I’m going to use those traits and explain why any developer who hopes to make the shift to being a concurrent software developer must find some way to draw those traits to the surface of their psyche.

Pessimistic

While I wouldn’t have used the term pessimistic (I would have used realist), the idea is sound. You must confront the reality of your situation. Period. You are an engineer. It is your job to make sure that the applications work and to anticipate ALL the things that can go wrong. Because if you don’t find the problem...someone else will. And if that someone is a customer, it can be bad for your company and maybe even for you.

Anticipating the worst requires mental discipline, a comfortable knowledge of how the system and software work, and the willingness to test your assumptions. All of this can be learned.

If you are using threading as your concurrency tool, it is absolutely vital that you understand context switching and time slices. You have to make sure that threaded code can lose a time slice at any point, still produce the expected results, and not cause other threads to fail. This needs to be tested and not just by the test team but by you. And not just on a single processor box, but on a multiprocessor box. Test is NOT a dirty word. You should be doing it early and often. Because when things go wrong...a good engineer makes sure that they know about it before anyone else.
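To make "lose a time slice at any point" concrete, here is a minimal sketch in Python. The `Counter` class, `run_workers`, and `stress` are names I made up for illustration, and `time.sleep(0)` is only a stand-in that invites the scheduler to switch threads between the read and the write. It is a sketch of the kind of test you can write yourself, not a complete threading test strategy.

```python
import threading
import time


class Counter:
    """Hypothetical shared counter whose increment is a read-modify-write."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        current = self.value       # read
        time.sleep(0)              # invite a context switch right here
        self.value = current + 1   # write back, possibly a stale value

    def safe_increment(self):
        # The whole read-modify-write is one critical section, so a
        # context switch in the middle cannot lose an update.
        with self._lock:
            current = self.value
            time.sleep(0)
            self.value = current + 1


def run_workers(increment, thread_count, iterations):
    """Call increment() `iterations` times from each of `thread_count` threads."""
    def work():
        for _ in range(iterations):
            increment()

    workers = [threading.Thread(target=work) for _ in range(thread_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def stress(method_name, thread_count=8, iterations=1000):
    """Return the final counter value after hammering the named method."""
    counter = Counter()
    run_workers(getattr(counter, method_name), thread_count, iterations)
    return counter.value


if __name__ == "__main__":
    expected = 8 * 1000
    print("expected:", expected)
    print("unsafe:  ", stress("unsafe_increment"))  # usually less than expected
    print("safe:    ", stress("safe_increment"))    # always equals expected
```

The unsafe version can only ever lose updates, so its total is at most the expected value, and how far short it falls changes from run to run and from box to box. That variability is exactly why this kind of test belongs on a multiprocessor machine as well as a single processor one.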

Remember, you are not in marketing...you are an engineer; anticipate the worst and plan for it.

Angered By Sloppy Code

If the code is sloppy...then how can you understand it? If you are using threads there will come a point in the development process where the only tools you’ll have for isolating a thread problem will be a log file and your knowledge of the code. When you get to this point...sloppy code can kill you. Concurrency developers don’t have the luxury of unclear thinking or convoluted solutions. Clean code AND a clean architecture make it easier to identify shared resources and potential problems.

"Someone who fixes a problem but doesn't take the time to find out what caused it is doomed to never become an expert in their field." – Rob Walling

If you try to fix a thread problem without understanding it...I can guarantee that you didn’t fix the problem...you just moved it. Thread problems often manifest when a time slice is released at just the right time and you’re not protecting something as well as you think you are. If you just fix the symptoms then the problem will manifest again and will be even harder to track down next time. Fix it when it happens and fix it right.
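Here is a small Python sketch of what "moving" a thread problem can look like. Everything in it (`Inventory`, `take_symptom_fix`, `take_root_fix`, `drain`) is hypothetical and exists only for illustration. The first method locks each access separately, which looks protected but still lets a context switch land between the check and the update. The second treats the check and the update as one critical section.

```python
import threading


class Inventory:
    """Hypothetical shared resource: a stock count that must never go negative."""

    def __init__(self, stock):
        self.stock = stock
        self._lock = threading.Lock()

    def take_symptom_fix(self):
        # Each access is locked on its own, but another thread can run
        # between the check and the update, so stock can still go negative.
        with self._lock:
            available = self.stock > 0
        if available:
            with self._lock:
                self.stock -= 1
            return True
        return False

    def take_root_fix(self):
        # Check and update happen under one lock acquisition.
        with self._lock:
            if self.stock > 0:
                self.stock -= 1
                return True
            return False


def drain(take, thread_count=8):
    """Call take() from several threads until it fails; return total successes."""
    counts = [0] * thread_count  # one slot per thread, so no sharing

    def work(slot):
        while take():
            counts[slot] += 1

    workers = [threading.Thread(target=work, args=(i,)) for i in range(thread_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sum(counts)


if __name__ == "__main__":
    inventory = Inventory(stock=1000)
    taken = drain(inventory.take_root_fix)
    print("taken:", taken, "remaining:", inventory.stock)
```

The symptom fix may pass every test you run on your own machine, which is the trap. Only the version that understands what the shared state is, and protects the whole operation, is guaranteed to leave the stock at zero with exactly as many successes as there were items.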

Long Term Life Planners

"Cause and effect, chain of events, All of the chaos makes perfect sense, When you’re spinning round things come undone...Welcome to Earth 3rd rock from the Sun" – Joe Diffie

Yes, I listen to country western music. I love that song because it is such a perfect example of how one thing affects another to the point where things just get nuts. And with threads things can get really nuts if you’re not careful. That is why planning is so important.

Concurrency should be planned into your application from the beginning but if that isn’t an option for you...plan how you’re going to add it. Don’t just sit down and start writing code. Figure out all the Pieces Parts. Hope for the best, but plan for the worst.

Once you’ve figured out where you can add concurrency and how you’re going to add it...you need to think about the future. To support concurrent applications you have to add more processes to your software development cycle.

Put processes in place to protect the code. For example, your team may have fifteen people on it, of which only two are responsible for the threaded code. Your two threading engineers should code review all changes made by the other engineers that could even remotely affect the threaded code. If you are the engineering manager, don’t make the mistake of thinking that everyone "gets" parallel development, because they don’t.

Also put in automated build and test processes designed to catch threading issues as close to the code change as possible. This will make tracking down the problems easier. Don’t forget to add logging and log levels to your applications and to TRAIN your test team.
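As one possible shape for that logging, here is a minimal sketch using Python's standard `logging` module. The logger name, the `ListHandler` class, and the `worker` and `run` functions are all assumptions made for this example; in a real application you would more likely attach a `logging.FileHandler`. The point is that every line carries a level and the thread name, so a log file can tell you which thread did what and in what order.

```python
import logging
import threading


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler; stands in for a log file in this sketch."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


def build_logger(handler, level):
    """Return a logger whose output includes the level and the thread name."""
    logger = logging.getLogger("concurrency-demo")  # name is an assumption
    logger.setLevel(level)
    logger.handlers.clear()
    logger.propagate = False
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s [%(threadName)s] %(message)s")
    )
    logger.addHandler(handler)
    return logger


def worker(logger, lock, shared, item):
    logger.debug("waiting for lock")      # noisy detail, DEBUG level only
    with lock:
        logger.debug("acquired lock")
        shared.append(item)
        logger.info("appended %s", item)  # the event you always want recorded


def run(level=logging.DEBUG, count=3):
    """Run `count` named worker threads; return the log lines and shared list."""
    handler = ListHandler()
    logger = build_logger(handler, level)
    lock = threading.Lock()
    shared = []
    threads = [
        threading.Thread(target=worker, name=f"worker-{i}",
                         args=(logger, lock, shared, i))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handler.lines, shared


if __name__ == "__main__":
    lines, _ = run(logging.DEBUG)
    print("\n".join(lines))
```

Running at INFO keeps the log small for normal use, and switching to DEBUG adds the lock traffic when you are chasing a problem. Naming the threads yourself, rather than relying on default names, makes the log much easier to read.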

Attention to Detail

Well...I think the need for this trait is summed up in the first three traits. The idea of paying attention to details is tied up in mental discipline, planning, and testing. Don't be lazy...just do it!

Conclusion

Concurrent software engineers have to juggle a harder programming model than sequential software engineers, so make it as easy on yourself as you can. Writing code for parallel execution requires that developers learn to think differently; it’s not just me waving my hands here.

"It's not intrinsically harder to write threads, but developers need to get used to thinking that way and we need help from the tools," Reinders said. "In the serial world, it doesn't matter which order you do things or how you break them down." - article link...

2 Comments:

RE: "Someone who fixes a problem but doesn't take the time to find out what caused it is doomed to never become an expert in their field."

I don't think he even phrased this strongly enough. If you "fix" the problem, but don't know what caused it, you haven't fixed it. You've changed the behavior, but you have no idea if it will come up again.

I think this relates strongly to your other posts about the complexities of threading. When a developer encounters a bug in multithreaded code, it is essential that s/he understand exactly what went wrong when fixing it. It's very easy to modify behavior in that case so that the bug stops appearing--without actually fixing the bug.

So I'll amend Rob's quote: "Someone who fixes a problem but doesn't take the time to find out what caused it hasn't fixed the problem."