Professor, Department of Computer Science

My Favorite Sayings

The greatest performance improvement of all is when a system goes from
not-working to working

Programmers tend to worry too much and too soon about performance.
Many college-level Computer Science classes focus on fancy
algorithms to improve performance, but in real life performance
rarely matters.
Most real-world programs run plenty fast enough on today's machines
without any particular attention to performance. The real challenges
are getting programs completed quickly, ensuring their quality, and
managing the complexity of large applications. Thus the primary design
criterion for software should be simplicity, not speed.

Occasionally there will be parts of a program where
performance matters, but you probably won't be able to predict
where the performance issues will occur. If you try to optimize the
performance of an application during its initial construction, you
will add complexity that hurts timely delivery and quality, and the
optimizations probably won't help performance at all; in fact, they
could actually reduce performance ("faster" algorithms often have
larger constant factors, meaning they are slower at small scale and
only become more efficient at large scale).
I've found that in most situations the simplest code is also the
fastest.
So, don't worry about performance until the application is running;
if it isn't fast enough, then go in and carefully measure to figure
out where the performance bottlenecks are (they are likely to be
in places you wouldn't have guessed). Tune only the places where
you have measured that there is an issue.
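The constant-factor effect is easy to explore for yourself. Here is a
minimal Python sketch (all names illustrative) that times a simple
linear scan against an asymptotically faster binary search at several
input sizes; where the crossover falls depends entirely on the language
and implementation, which is exactly why measuring beats guessing:

```python
import bisect
import timeit

def linear_search(items, target):
    # O(n), but trivially simple: scan until the target is found.
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

def binary_search(items, target):
    # O(log n) on a sorted list, but with more bookkeeping per step
    # (the constant factor).
    i = bisect.bisect_left(items, target)
    if i < len(items) and items[i] == target:
        return i
    return -1

for n in (4, 64, 1024):
    data = list(range(n))       # sorted, as binary search requires
    target = n - 1              # worst case for the linear scan
    t_lin = timeit.timeit(lambda: linear_search(data, target), number=1000)
    t_bin = timeit.timeit(lambda: binary_search(data, target), number=1000)
    print(f"n={n:5d}  linear={t_lin:.5f}s  binary={t_bin:.5f}s")
```

Run it on your own machine before drawing any conclusions; the point of
the sketch is the measurement, not a predetermined winner.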

Use your intuition to ask questions, not to answer them

Intuition is a wonderful thing. Once you have acquired knowledge and
experience in an area, you start getting gut-level feelings about the
right way to handle certain situations or problems, and these
intuitions can save large amounts of time and effort.
However, it's easy to become overconfident and assume that your
intuition is infallible, and
this can lead to mistakes. So, I try to treat intuition as a
hypothesis to be verified, not an edict to be followed blindly.

For example, intuition works great when tracking down bugs: if I
get a sense of where the problem lies, I can quickly go to the
code and verify whether that really is the cause.
For more abstract tasks such as design I find that intuition
can also be valuable (I get a vague sense that a particular
approach is good or bad), but the intuition needs to be followed
up with a lot of additional analysis to expose all the underlying
factors and verify whether the intuition was correct. The intuition
helps me to focus my analysis, but it doesn't eliminate the need
for analysis.

One area where people frequently misuse their intuition is
performance analysis. Developers often jump to conclusions about
the source of a performance problem and run off to make changes
without making measurements to be sure that the intuition is
correct ("Of course it's the xyz that is slow"). More often
than not they are wrong, and the change ends up making the system
more complicated without fixing the problem.
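Making those measurements is cheap. A minimal sketch using Python's
standard cProfile module (the functions being profiled are hypothetical
stand-ins) shows how to let the profiler, rather than intuition, point
at the hot spot:

```python
import cProfile
import io
import pstats

def slow_parse(lines):
    # Hypothetical hot spot: repeated string concatenation is O(n^2).
    out = ""
    for line in lines:
        out += line.upper()
    return out

def run():
    return slow_parse(["hello world\n"] * 5000)

profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

# Print the top five entries by cumulative time; the profiler, not
# intuition, identifies where the time actually went.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Ten lines of profiling often settles an argument that hours of
speculation cannot.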

Another reason for constantly challenging and validating your
intuitions is that over time this will sharpen your intuitions
so that they work even better for you. Ironically, people who
are most dogmatic about their intuitions often seem to have the
least well-developed intuitions. If they would challenge their
intuitions more, they would find that their intuitions become
more accurate.

The most important component of evolution is death

Or, said another way, it's easier to create a new organism
than to change an existing one. Most organisms are highly
resistant to change, but when they die it becomes possible
for new and improved organisms to take their place. This rule
applies to social structures such as corporations as well
as biological organisms: very few companies are capable of
making significant changes in their culture or business
model, so it is good for companies eventually to go out
of business, thereby opening space for better companies in
the future.

Computer software is a counter-example to this rule, with ironic
results. Software is incredibly malleable: it can be
updated with new versions relatively easily to fix problems
and add new features. It is easier to change existing software
than to build new software, so software tends to live a long
time. To a first approximation,
software doesn't die (compare this to the hardware in a
computer, which is completely replaced every few years).
At the same time, it is difficult to make major structural
improvements to software once it has been shipped, so mistakes
in early versions of the program often live forever. As a
result, software tends to live too long: just good enough
to discourage replacement, but slowly rotting away with more
and more problems that are hard to fix. I wonder if
the overall quality of computer software would improve if there
were a way of forcing all software to be replaced after some
period of time.

Facts precede concepts

A fact is a piece of information that can be observed or measured;
a concept is a general rule that can be used to predict many facts
or a solution to many problems.
Concepts are powerful and valuable, and acquiring them is the goal of
most learning processes. However, before you can appreciate or
develop a concept you need to observe a large number of facts
related to the concept. This has implications both for teaching
and for working in unfamiliar areas.

In teaching it's crucial to give lots of examples when
introducing a new concept; otherwise the concept won't make
sense to the students. Edward Tufte describes this process
as "general-specific-general": start by explaining the concept,
then give several specific examples to show where the concept does
and does not apply, then reiterate the concept by showing how
all the examples are related.

I also apply this principle when I'm working in a new area and
trying to derive the underlying concepts for that area. Initially
my goal is just to get experience (facts). Once I have a
collection of facts to work from, then I start looking for
patterns or themes; eventually these lead to concepts. For
example, a few years ago I started working on my first large
Web application. My goal was to develop a library of reusable
classes on which to base the application, but being new to Web
development I had no idea what those classes should be. So, I
built the first simple version of the application without any
shared code, creating each page separately. Once I had developed
a dozen pages I was able to identify areas of functionality that
were repeated over and over in different pages, and from this
I was able to develop a set of classes that implemented the
shared functionality. These classes represented the key
concepts of that particular application.
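As a hedged sketch of what such factoring might look like (the class
names and HTML here are purely illustrative, not the actual library),
the header/footer logic repeated across a dozen hand-built pages could
be pulled into a shared base class:

```python
# Illustrative only: a shared base class distilled from repeated code.

class Page:
    """Shared functionality observed across many hand-built pages."""

    def __init__(self, title):
        self.title = title

    def header(self):
        return f"<html><head><title>{self.title}</title></head><body>"

    def footer(self):
        return "</body></html>"

    def body(self):
        raise NotImplementedError  # each page supplies its own content

    def render(self):
        return self.header() + self.body() + self.footer()

class AboutPage(Page):
    def body(self):
        return "<p>About us</p>"

print(AboutPage("About").render())
```

The important part is the order of events: the base class was derived
from a dozen concrete pages (facts), not designed up front (concepts).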

If you don't know what the problem was, you haven't fixed it

Here's a scenario that I have seen over and over:

A developer is tracking down a difficult problem, often
one that is not completely reproducible.

In a status meeting the developer announces that the
problem has been fixed.

I ask, "What was the cause of the problem?"

The developer responds "I'm not really sure what the
problem was, but I changed xyz and the problem went
away."

Nine times out of ten this approach doesn't really fix the problem;
it just submerges it (for example, the system timing might have
changed so that the problem doesn't happen as frequently).
In a few weeks or months the problem will
reappear. Don't ever assume that a problem has been fixed until
you can identify the exact lines of code that caused it and convince
yourself that the particular code really explains the behavior
you have seen. Ideally you should create a test case that
reliably reproduces the problem, make your fix, and then use that
test case to verify that the problem is gone.
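A minimal sketch of that workflow in Python (parse_price and its bug
are hypothetical stand-ins): capture the failing input as a permanent
test, so the fix is verified now and the bug stays fixed later:

```python
def parse_price(text):
    # Hypothetical function that contained the bug. The fix: strip
    # surrounding whitespace before parsing; the original code choked
    # on a trailing newline.
    return float(text.strip().lstrip("$"))

def test_price_with_trailing_newline():
    # This exact input reproduced the original failure reliably.
    assert parse_price("$19.99\n") == 19.99

test_price_with_trailing_newline()
print("regression test passed")
```

The test case is the proof that you understood the problem: it fails
before the fix and passes after it.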

If you do end up in a situation where you make a change and the
problem mysteriously goes away, don't stop there. Undo the change
and see if the problem recurs. If it doesn't, then the change is
probably unrelated to the problem. If undoing the change causes
the problem to recur, then figure out why. For example, try
reducing the scope of the change to find the smallest possible
modification that causes the problem to come and go. If this
doesn't identify the source of the problem, add additional tracing
to the system and compare the "before" and "after" traces to
see how the change affected the behavior of the system. In my
experience, once I have a code change that makes a problem come and
go I can always find the source of the problem fairly quickly.
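Comparing "before" and "after" traces can be as simple as diffing them.
Here is a small sketch using Python's standard difflib (the trace lines
themselves are hypothetical):

```python
import difflib

# Hypothetical traces captured before and after the mystery change;
# in practice these lines would come from log files or added tracing.
before = ["acquire lock", "read cache", "release lock", "write result"]
after = ["acquire lock", "read cache", "write result", "release lock"]

# A unified diff shows exactly where the two runs diverge.
for line in difflib.unified_diff(before, after, "before", "after", lineterm=""):
    print(line)
```

Once the divergence point is visible, the question "why did the change
matter?" usually has a short answer.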

If it hasn't been used, it doesn't work

This is one of the biggest frustrations of software development.
You design and implement a new feature or application, you test
it carefully, and you think you are done. Unfortunately you
aren't. No matter how carefully you have tested, there will
be problems as soon as QA gets their hands on it or someone
tries to use the feature or application for
real work. Either there will be bugs that you missed, or some
of the features will be clumsy, or additional features will be
needed. Sometimes the entire architecture turns out to be wrong.
Unfortunately, the problems come out at a time when you
are ready to move on to the next thing (or perhaps you already
have moved on), so it's frustrating to go back and spend more
time on a project that you thought was finished. And, of course,
you didn't budget time for this so the cleanup work causes delays
in your next project.

I don't know any solution to this problem except to realize
its inevitability and plan for it. My rule of thumb is that
when you think you are finished with a software project (coded,
tested, and documented, and ready for QA or production use)
you are really only 50-75% done. In other words, if you
spent 3 months in initial construction, plan on spending
another 1-3 months in follow-up work. One way to minimize this
problem is to get your new software in use as soon as possible.
If you can create a skeletal version that is still useful,
get people trying it out so you can find out about problems
before you think you're finished. This is one of the ideas
behind Agile Development.

Sometimes people just refuse to do the follow-up work: "It's
not my highest priority" or "I will get to it when I have time".
If you take this approach you'll produce mediocre software.
No software is ever gotten right the first time. The only
way to produce high-quality software is to keep improving and
improving it. There are two kinds of software in the
world: software that starts out crappy and eventually becomes
great, and software that starts out crappy and stays that way.

The only thing worse than a problem that happens all the time
is a problem that doesn't happen all the time

Not much to say about this one: it's painful to debug a
problem that isn't reproducible. I have spent as long as
6 months tracking down a single nondeterministic bug.
Conversely, in my experience any problem that can be easily
reproduced can also be tracked down pretty quickly.
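When the nondeterminism can be routed through a random-number
generator, one way to convert a flaky bug into a reproducible one is to
search for a seed that fails. A hedged Python sketch, with a
hypothetical stand-in for the flaky operation:

```python
import random

def flaky_operation(rng):
    # Hypothetical stand-in for a nondeterministic failure such as a
    # race condition: it fails only for rare sequences of random draws.
    return rng.random() > 0.01  # True = success, False = failure

def find_failing_seed(max_seed=10000):
    # Route the nondeterminism through a seeded RNG, then search for a
    # seed that triggers the failure.
    for seed in range(max_seed):
        if not flaky_operation(random.Random(seed)):
            return seed
    return None

seed = find_failing_seed()
if seed is not None:
    # The bug is now deterministic: this seed fails every time.
    print(f"reproducible failure with seed {seed}")
```

The same idea applies to thread schedules and network timing: anything
you can seed or record, you can replay.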

The three most powerful words for building credibility
are "I don't know"

Many people worry that not knowing something is a sign of
weakness, and that if a leader seems not to have all the
answers they will lose the confidence of their team.
Such people try to pretend they have the answer in every
situation, making things up if necessary and never admitting
mistakes.

However, this approach ultimately backfires. Sooner or later
people learn the truth and figure out that the person never
admits when they don't know. When this happens the person
loses all credibility: no one can tell whether the person is
speaking from authority or making something up, so it isn't
safe to trust anything they say.

On the other hand, if you admit that you don't know the answer,
or that you made a mistake, you build credibility. People
are more likely to trust you when you say that you do have
the answer, because they have seen that you don't make
things up.

Coherent systems are inherently unstable

A coherent system is one where everything is the same in some
respect; the more things that are uniform or shared, the more
coherent the system is. For example, a typical cornfield in
Iowa is highly coherent: every corn stalk is from the same
strain; they're all planted at the same time, fertilized at
the same time, and harvested at the same time. The world of
computing is also fairly coherent: most of the world's computers
run one of a few versions of Windows, and almost any computer
in the world can be reached using the TCP/IP protocol suite.
Human-engineered systems tend to be coherent.

Natural systems tend not to be coherent. For example, consider the
ecosystem of a wetland: there are numerous different species
of plant and animal sharing the same area, but behaving very
differently with complex interrelationships. The behavior of
the overall system is hard to predict from the behavior of any
individual in it.

Coherent systems often have advantages of efficiency, which is why
humans gravitate towards them. For example, it's easier to plant
the same seed everywhere in a cornfield, and given that some seeds
are better than others, it's more efficient to use the best seed
everywhere. It's also easier to harvest if all of the corn ripens
at the same time. It's more efficient to have a single operating
system running most of the world's computers: once a new facility
is implemented for that system, everyone in the world can benefit
from it. If there were dozens of different operating systems, then
new applications would have to be reimplemented for each of them.

Unfortunately, coherent systems are unstable: if a problem arises
it can wipe out the whole system very quickly. For example, a
new plant disease could quickly take out a large fraction of
U.S. grain production. Computer viruses are another example:
a virus that takes advantage of a bug in Windows can potentially
impact most of the world's computers. The U.S. stock market
exhibits a certain degree of coherency in the way people think
and trade, which results in huge swings up and down as investors
move en masse to buy the latest fad or sell when a recession
looms.

The incoherency of natural systems gives them greater stability.
For example, a particular plant disease could probably only
affect a small fraction of the species in a wetland.