Testing Toasters

Hot on the heels of my last post announcing that my blogging was moving, I’m right back here with a test-focused post.

Something hit me yesterday, and not for the first time, that I had to call out.

Let’s talk about building a quality user experience and compare this to where I see folks spending most of their time – building a quality *functional* experience.

When I think about a quality user experience, I think about a feature that plays well with others and is trivial to understand and use. So that’s easy to write down and it’s easy for folks to agree that building such an experience is important. But what’s hard is detecting when you don’t have a quality user experience.

As I’ve commented before, you have a number of elements working against you here. Some of these are psychological. When you work with a feature day in and day out, you become used to its warts. You figure out how to work around them, and this workaround mentality becomes so ingrained that it’s second nature. You become blind to detractors of a quality user experience, and you’re surprised when someone else tries your feature and struggles to use it. I wouldn’t be surprised if your work environment is also working against you. It’s been a hectic week and you have lots of deliverables like test plans and automation. You’re not thinking about the optimal user experience, because you’re swamped just trying to cover basic functional testing.

You need to make a mental shift to wearing that proverbial customer hat and evaluating the user experience. When I think about a feature, the first thing I want to know is the user experience we’re trying to nail. I don’t care about technical implementation details.

This is a key tenet of Scenario Focused Engineering (SFE), which we push heavily at Microsoft. The primary idea is that the customer is put at the center of the development process and the focus is on delighting the customer. Notice that we’re not talking about engineering details of “how” this is going to be done. Our users don’t care about that. The technical challenges you had to overcome to deliver your feature are irrelevant to them. That’s a key point. We often get bogged down in technical details around implementation or integration with other features, and this causes us to miss the forest for the trees. We need to clearly define the user scenarios we want to deliver, avoid getting bogged down in technical details, and constantly check ourselves against delivering those uncompromised user experiences. This is what Apple does so well these days, and the results speak for themselves.

With a clear understanding of the ideal user experience you want to deliver, assess your feature against this. This is the hard part for the reasons I previously noted. So much is working against you here and it takes training to stay focused to produce the best assessment.

Let’s go through an example showing an integration scenario between an Azure web site and Hosted TFS which will be used for source code control. I thought a video would be best, so here we go:

*(Video: a walkthrough of connecting an Azure web site to Hosted TFS for source control.)*

Now that you have ideas around how the actual user experience differed from the ideal, you’re probably ready to go back to the developers to discuss this. Often, the gaps between the ideal and the actual are caused by technical challenges (i.e., stuff is just hard to do). That’s understandable, but it shouldn’t always excuse us from tackling those problems. That’s why we’re paid the big bucks, right? We’re able to solve these challenging problems. We need to fight harder to deliver awesome user experiences. Just nailing fundamental functionality isn’t enough. You have to go the extra mile. Customers are worth it.

My day to day work at Microsoft has changed in the past year such that I now focus more on testing via development using Microsoft’s Cloud and Web technologies. As part of this change, I want to share more content that’s specifically oriented to development. Hence, I’ve started a new blog with a few other co-workers at www.cloudydeveloper.com. So head on over there and check us out!

If I do bump into some interesting test focused material, I’ll still likely share that here.

I’m a visual thinker. When collaborating with another tester, I always go to the whiteboard to draw up the system we’re testing. Pictures help me understand. Just listening to someone for ten minutes while they describe all of the pieces, communication, data flow, etc. in a system causes my brain to melt. I need to write things down to *see* what they’re describing. That’s the way my brain works. I’m probably just getting old.

Anyway, I think this is a useful process so I’ll share it.

Taking a step back, this process helps me because it engages multiple parts of my brain simultaneously. Instead of being a passive listener while someone explains a system, I’m actively engaged. My hands are drawing while my ears are listening. In drawing, I’m focused on identifying the critical pieces of information I’m being told (that’s what I draw) while ignoring extraneous details. This makes me focus on what’s important.

For those involved, drawing on the whiteboard also gets everyone on the same page. If I’m explaining software interactions to you, I’m not certain you’re understanding me. Granted, you might ask for clarification here and there, but maybe not always or you might make incorrect assumptions in your head. Putting our thoughts on the whiteboard helps ensure we’re on the same page because we can see what we’re thinking. If I draw two components and join them with an arrow indicating data flow, you can correct me if I’m wrong.

Drawing also helps identify what we don’t know. In a way, we’re teaching the system to each other while we’re drawing. There’s no fudging of information here. Lots of questions are being asked. When we can’t explain how part of the system works, we’ve identified a gap in our knowledge. Perfect. Now we know that we don’t know. Time to figure it out.

Okay, enough hand waving. Let’s go through an example using NuGet. Never heard of NuGet? No sweat. It's a package management system. Read more about it if you’re interested. I’ll wait.

First, a bit of NuGet background to set the stage. If you haven’t read a bit about NuGet, none of this will make sense, but bear with me.

When you add a NuGet package to a Visual Studio project, NuGet adds the files for the package (e.g., a dll) to your project and additionally copies the files to disk at the Visual Studio solution level, which is a few physical directories up the hierarchy. It’s like making a backup. But what if you don’t want the copy? Maybe you’re constantly forgetting to check those files into source control, and when your buddy loads up your project on another machine, it fails to build because those files are missing.

Wouldn’t it be nice if NuGet were smart enough to download your packages again? Surprise! NuGet v1.6 has this feature!
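As a rough sketch of the restore idea (my own illustration in Python, not NuGet’s actual implementation), you can think of it as: read packages.config, check the local packages folder, and re-fetch anything that’s missing.

```python
# Sketch of the package-restore idea: packages.config lists the packages a
# project uses; anything not found on disk is a candidate for re-download.
import os
import xml.etree.ElementTree as ET

def missing_packages(config_path, packages_dir):
    """Return (id, version) pairs listed in packages.config but absent on disk."""
    tree = ET.parse(config_path)
    missing = []
    for pkg in tree.getroot().iter("package"):
        pkg_id, version = pkg.get("id"), pkg.get("version")
        # NuGet lays packages out as <packages_dir>/<Id>.<Version>/
        folder = os.path.join(packages_dir, f"{pkg_id}.{version}")
        if not os.path.isdir(folder):
            missing.append((pkg_id, version))
    return missing
```

A restore step would then loop over `missing_packages(...)` and download each one from the configured feeds.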

When first explained to me, I mentally drew pictures, but when you’re pouring a gallon of knowledge into a shot glass of a brain, knowledge spills out.

I stopped the explanation, picked up a marker, and went to the whiteboard to produce this (yes, my penmanship is laughable):

On the whiteboard we see:

Web applications (App A and App B) in the upper left corner that are part of the same VS solution.

A “packages.config” file which records the NuGet packages used by an application.

In the upper right is the local package cache (PKG CACHE) that’s queried first during the project build phase.

In the bottom right, is a cloud representing NuGet feeds (NuGet Src). Package bits are downloaded again if they weren’t found in the package cache.

Between each of these are arrows with notes indicating operations. For example, between the web applications and the “packages.config” file we have a “read” operation. That’s Visual Studio looking up the packages used by a project. Next, Visual Studio checks the package cache to see if the package bits are available on-disk. If that fails, we download the package again.

While drawing this system, we came up with questions (see the ‘?’s) regarding functionality and design.

For example, we wondered what would happen if downloading a package required authentication (i.e., a username/password). How would you prompt a user to enter these during a project's build process? Would a blocking prompt pop up? What if the build was triggered via the command line? There’s no UI shell to work with there. Maybe we need a way to pre-define these credentials?
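One way to sketch that “pre-define the credentials” idea (the environment variable names here are entirely made up; this is not a real NuGet mechanism): prefer pre-defined credentials, and only prompt when an interactive console is actually attached.

```python
# Hypothetical credential lookup for a non-interactive build: check
# pre-defined environment variables first, and fall back to prompting
# only when stdin is a real console.
import os
import sys

def get_feed_credentials():
    """Return (username, password) for an authenticated package feed."""
    user = os.environ.get("NUGET_FEED_USER")      # made-up variable name
    pwd = os.environ.get("NUGET_FEED_PASSWORD")   # made-up variable name
    if user and pwd:
        return user, pwd
    if sys.stdin.isatty():
        # Interactive build: a blocking prompt is at least possible here.
        return input("Username: "), input("Password: ")
    raise RuntimeError("No credentials pre-defined and no console to prompt on.")
```

A command-line build with no console would hit the `RuntimeError` unless the credentials were pre-defined, which is exactly the design question we were raising.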

One question you might be asking is “Where’s the developer design spec?” Wouldn’t that answer your questions? The answer is the classic “it depends”. We’ve become more “agile” with our development and I now see fewer formal design documents and specs. We use wikis heavily and the level of detail is sometimes low. Let’s sidestep the debate on whether this is good or bad. The bottom line is that in the absence of such documentation, drawing out the system is one way to help you identify what you know and some of what you don’t know.

Long story short, drawing the system under test is helping me raise design questions and identify interesting test scenarios at a low cost.

Still not interested in this approach? Grab a fist full of whiteboard markers, remove the caps and inhale deeply. I’m certain you’ll see my point eventually.

I’m becoming obsessed with error messages. Working with web development frameworks and related tooling, I bump into lots of these, and as I’ve written before, it’s easy for them to go wrong. It’s also easy to brush off their imperfections and simply continue on my way. By arming myself with this checklist, I’m less likely to do that. Here’s my new error message testing checklist:

Does the error message point to the right problem?

This one’s a no-brainer. Simply put, does the error message match the error? This one points nowhere.

Is the error message out-of-date?

We often re-use error messages across multiple versions of a product, but they can become out-of-date. For example, an ASP.NET error message might reference a version of IIS that differs from the one shipped with a newer OS.

Is the error message easy to read?

A smaller-than-usual font might be used to pack a lot of information into a small amount of space. Or maybe there’s a poor choice of formatting. For example, you have a single large paragraph that could easily be broken up into easy-to-read bullet points.

Is the error message content appropriate for the setting in which it’s encountered?

Stack traces are often helpful, but only in the right context. An ASP.NET error stack trace is often valuable since it tells the user how *their* code triggered an error (e.g., passing a null value). This stack trace, presented in Visual Studio, is a counterexample. Maybe it’s an exception that should have been caught and re-thrown with a user friendly error message. Anyway, you get the idea.

Should the error message include a link to more information?

If you need to display a lot of content and have limited space, you could include a link to a web page that has more details. This has an additional benefit since you can update the information web page without changing any product code. Here’s an example.

Can the error message be easily copied?

In the dialog above, it’s not easy to highlight the link for copying. The trick is to select the dialog and press CTRL+C to get the contents. That’s probably not obvious to users. Making the link clickable directly from the dialog would have been even better.

Does the error message provide overwhelming detail?

Instead of providing individual error messages for individual error conditions, sometimes a single error message is used for a collection of error conditions. Here’s the error message when an ASP.NET automatic database creation step fails. It covers a number of possible reasons for the failure. I’m not saying this is a bad error message, but it’s a possible example of overwhelming detail.

Is the error message ambiguous?

Looking at the error message below, which operation is it referring to? I might have specified more than one.

Is the error message localizable?

Showing the same string regardless of OS language would be a mistake. It needs to be translated. With the .NET framework, we pull the text of error messages from resource files that are localized into different languages.
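A minimal sketch of the resource-table approach (the keys, strings, and structure here are made up for illustration): look the message up by key in the requested language’s table, falling back to a neutral language rather than hard-coding any one string.

```python
# Toy resource tables keyed by language, standing in for localized
# resource files. Real frameworks load these from satellite assemblies
# or .resx files rather than an in-code dict.
RESOURCES = {
    "en": {"ERR_DB_CONN": "Could not connect to the database."},
    "de": {"ERR_DB_CONN": "Verbindung zur Datenbank konnte nicht hergestellt werden."},
}

def error_message(key, lang="en"):
    """Look up a localized error string, falling back to English, then the key."""
    table = RESOURCES.get(lang, {})
    return table.get(key, RESOURCES["en"].get(key, key))
```

The fallback chain matters for testing: an untranslated key should still surface a readable English message, not a raw resource identifier.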

Does the error message require additional context to be understood?

This one was triggered only if in a previous dialog the user targeted their local machine for the “deployment action”. Looking at this error message in isolation, the connection isn’t obvious.

Showing an error message to a co-worker unfamiliar with a feature is a great way to catch this.

So there you have it. A set of some additional checks you can use to better assess the quality of your error messages. Now if we could only prevent users from causing errors in the first place…

Geez, Dilbert. Must we perpetuate the stereotype that QA engineers (or testers) are intellectually inferior to developers?

Here’s the strip for 10/28/11:

While I don’t doubt that there are QA jobs that require little brainpower, many folks seem quick to generalize and lump *all* QA jobs into this bucket. I even get a whiff of this when interviewing external candidates for testing positions at Microsoft.

Candidate asks - “How much time will I spend coding?”

I translate this into - “How little time will I have to spend doing inferior QA type activities?”

Hopefully you read the previous blog post which set up (no pun intended) the context of this second follow up post. If not, take a few minutes and read it before continuing here.

With your matrix of setup configurations fully defined, it’s time to turn your attention to executing the individual scenarios in an effective and efficient manner.

The first thing to do is check your set of test configurations and make sure none of them are invalid. For example, maybe there are some operating systems you mistakenly assumed your product would support, but it turns out that with this version, you’ve dropped support for some of the older ones.

The next big task is to prioritize the setup configurations. Assuming you have neither infinite time nor resources, you need to selectively pick the configurations you can run. Here, having a good sense of your customers’ common configurations helps. Hopefully you have some historical data on this, or maybe it’s something you can collect via a survey. Otherwise, you’ll just have to rely on your best guess of the likely configurations. If that’s the case, you might want to enlist a pairwise testing tool.

If you’re not familiar with the concept of pairwise testing (or all-pairs testing), it’s a way to reduce the full combinatorial matrix of your variables down to a smaller set that will likely provide “good” test coverage. It’s a cost-benefit tradeoff that might be worth exploring. Let me provide a quick example.

Imagine that you’re dealing with three variables for your setup test plan, each with two values:

Operating system – Windows XP, Windows 7

Architecture – 32-bit, 64-bit

Edition – Basic, Professional

A pairwise analysis of these variables would yield only 4 setup configurations, but would ensure that each pair of values is covered at least once. So you might get something like:

Windows 7, 32-bit, Basic Edition

Windows XP, 32-bit, Professional Edition

Windows 7, 64-bit, Professional Edition

Windows XP, 64-bit, Basic Edition

You can see that “Windows 7, 32-bit, Professional Edition” is not in the list. Breaking that configuration down into its value pairs, you still get coverage for “Windows 7, 32-bit” in configuration #1, “Windows 7, Professional Edition” in configuration #3, and “32-bit, Professional Edition” in configuration #2.
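A pairwise coverage claim like this is easy to check mechanically. Here’s a small helper (my own sketch, not taken from any particular tool) that reports which variable-value pairs a candidate set of configurations leaves uncovered:

```python
# Enumerate every pair of values across different variables, then strike
# out the pairs that at least one configuration exercises. An empty result
# means the set is a valid pairwise cover.
from itertools import combinations, product

def uncovered_pairs(variables, configs):
    """variables: list of value lists, one per variable.
    configs: list of tuples, one value per variable, in the same order."""
    needed = set()
    for (i, vals_i), (j, vals_j) in combinations(enumerate(variables), 2):
        for a, b in product(vals_i, vals_j):
            needed.add(((i, a), (j, b)))
    for cfg in configs:
        for (i, a), (j, b) in combinations(enumerate(cfg), 2):
            needed.discard(((i, a), (j, b)))
    return needed
```

Feed it your variable definitions and a proposed configuration list; anything it returns is a pair your plan never tests.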

Now, with your setup matrix streamlined and prioritized, you can finally get down to testing. The question though is “who” should do it? In my opinion, this is a great task to be offloaded to less costly resources like vendors or offshore teams. We do this in my group, and it works well, but only because we’re crystal clear and detailed in our communication. Notation is standardized so there’s no confusion on what each setup test definition means in terms of products, install/uninstall order or verification steps. If there’s a lot of back and forth between you and the other resource, you’re not going to save much time or effort.

After running a setup test, you need to verify “success”, but what is this verification process? Are you simply verifying that the setup process ran end-to-end without crashing, or is there more to check?

I hope you answered “there’s more to check”. At a high level, you should at least be evaluating the following:

A sanity customer scenario can be completed after the product is installed.

If there are other apps that might be affected by the setup, verify they haven’t been broken. We do such checks when testing SxS setups of Visual Studio. An install of the latest version shouldn’t break the previous one.

If testing uninstall as well, verify you haven’t left any “turds” on the box. This could be files, registry entries, etc. Doing so is just sloppy and might break a future install of a new version of your product. Using tools to take a snapshot of the file system and registry for comparison purposes can help here.

Assess the speed of the setup install. Is it reasonable? Is there a UI providing meaningful feedback to the customer so they know it’s proceeding successfully?
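For the leftover-files check above, even a crude snapshot-and-diff helps. Here’s a minimal sketch covering files only (real snapshot tools also cover the registry):

```python
# Snapshot the file paths under a root before install and after uninstall;
# anything new in the "after" set is a leftover the uninstaller missed.
import os

def snapshot(root):
    """Record every file path (relative to root) under a directory tree."""
    paths = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.add(os.path.relpath(os.path.join(dirpath, name), root))
    return paths

def leftovers(before, after):
    """Files present after uninstall that weren't there before install."""
    return after - before
```

Take the first snapshot on the clean machine, run install then uninstall, snapshot again, and flag anything `leftovers` reports.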

Okay, so those are my thoughts on setup testing planning, execution and verification. In closing, I’ll paraphrase my primary message from the previous post: your product is useless if your customers can’t successfully install it on their machines. Please don’t screw this up. :-)

We do a lot of setup testing at Microsoft. Given the wide distribution of our technologies across a massive matrix of user machine configurations, we can’t afford to skimp on this testing and only cover a few mainline combinations and hope for the best. We put a lot of thought and effort into setup testing.

In this first post of what I think will be a two part series, I’ll focus on setup testing planning. How do you create the test scenarios that go into this type of test plan? In a subsequent post, I’ll dive into execution and verification of the setup test plan. Now, the way I approach setup testing is heavily influenced by my experiences at Microsoft, which shouldn’t be a surprise, but I hope there are some generic points below that can be applied to almost any product.

Let me take a step back and define the purpose of setup testing. Simply, we’re trying to install a product on a machine and verify it works as expected while having no negative impact. That last part is sometimes inadvertently omitted, but you’d be embarrassed if your product’s install broke other products on the user’s machine (unless they were competitors).

Okay, let me take another quick step back (I’ll move forward soon) and describe the motivation for discussing this topic. Up until now, every time I’ve been asked to develop a setup test plan or review one, I’ve primarily relied on my memory to come up with the scenarios. This is idiotic. I’m biting the bullet here and spending a bit of time to document my approach. No more solely relying on your squishy brain.

Moving ahead, the first step in setup test planning is to define the variables that form the different configurations we need to test. We’re thinking about all of the moving parts in the system that could impact whether setup succeeds or fails.

Let me try to define a semi-structured list of variables to consider and questions to answer.

Architecture – Does your product run on both 32-bit and 64-bit machines?

“Dirty” machines – What other bits might be installed on a customer’s machine that could interfere with your product’s setup (e.g., aggressive anti-virus software)? Similarly, make sure you’re not *always* testing on brand new clean machines.

Screen size – If you have a UI-based installer, run it on a monitor with a low resolution (think netbook). Does the UI scale properly or does it get cropped?

Disk space – Run the install on a disk that doesn’t have enough free space for the install to succeed.
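The disk space condition is easy to pre-check in an installer or a test harness. A sketch, assuming you know the install’s approximate footprint:

```python
# Check free space on the target volume before attempting the install.
# A real installer should also leave headroom for temp files and logs.
import shutil

def enough_disk_space(path, required_bytes):
    """Return True if the volume containing `path` has the required free space."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes
```

The interesting test case is the negative one: deliberately run setup where this check would fail and verify the installer reports the problem cleanly instead of dying partway through.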

Product configuration:

Versions – Have you shipped previous versions of your product? Do you plan to ship future versions beyond the current version you’re working on?

Future versions are an interesting test scenario. You’ll want to ensure you’re not doing something today with your current setup that will cause you setup pain in the future. For example, not writing a registry key which indicates the version of your product installed. You might want a future version of your product to have this information.

When we test our frameworks, we often build a “mock” vNext-Next version (that’s two versions ahead) and do setup testing with it.

Side-by-Side (SxS) – Can your product be installed SxS with other versions of your product? If so, which combinations are supported?

Install/Uninstall configuration:

Different Installers - Is there more than one way to install your product? Maybe you have an interactive UI-based installer and an automated command-line installer. Or maybe different installers for different OSes?

Location of installer – Does it matter if the installer is run locally versus on a remote network share?

Install location – Is this fixed for the user or can they customize it?

We’re sometimes nailed by this because all of our test machines have the same setup of two drives and when a customer installs the product on a machine with just one drive, we bomb.

User accounts – Does your product have to be installed by an administrator level account or will a non-admin account work as well? Can one user account install the product and another user account uninstall it?

This scenario of one user installing and another user uninstalling is interesting for shared servers where the first user account is deleted at some point.

Order of operations – If SxS configurations (see above) of your product are possible, vary the installation and uninstallation of them. For example, you could [install v1, uninstall v1, install v2] as one configuration or [install v1, install v2, uninstall v1] as a second configuration. Both should produce roughly the same result.
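The “both orders should land in the same state” idea can be captured in a toy model (deliberately simplistic; real installers have far messier shared state than a set of version strings):

```python
# Replay a sequence of install/uninstall operations and return the final
# set of installed versions, so two orderings can be compared directly.
def run(ops):
    """ops: list of (action, version) tuples, e.g. ("install", "v1")."""
    installed = set()
    for action, version in ops:
        if action == "install":
            installed.add(version)
        elif action == "uninstall":
            installed.discard(version)
        else:
            raise ValueError(f"unknown action: {action}")
    return installed
```

Asserting that both sequences from the example end with only v2 installed is exactly the kind of check you’d automate across your SxS matrix.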

Blocking – Are there configurations where an install or uninstall operation should be blocked? Maybe installation shouldn’t be possible if there’s a previous version of the product already on the machine, or you can’t uninstall because other products have a dependency on this one. Related to that, if the operation is blocked, is a meaningful message presented to the user instructing them on how to become unblocked?

Miscellaneous:

Dependent products - Are you assuming that some components must already be on the box?

I’ve seen my team make assumptions that IIS will always be installed before we try to install ASP.NET because that’s the way our test machines are set up, but there’s no guarantee of this.

Files in use - Are there files that setup needs to access that might be locked by another process?

Microsoft products often recommend exiting all other applications before installation begins to help ensure there are no “file in use” conflicts.

Right bits - Make sure the build you’re testing is the one you’re shipping. If you’ve done some testing and the bits change, re-evaluate the setup test scenarios you need to re-run.
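Comparing content hashes is a cheap way to confirm the bits you tested are the bits you’re shipping. A sketch:

```python
# Hash a build artifact so the tested build can be compared against the
# shipping build; any mismatch means re-evaluating which tests to re-run.
import hashlib

def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()
```

Record the digest when you start testing; if the digest of the release candidate differs, the bits changed and your setup pass needs another look.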

With the configuration variables defined and the various questions answered, it’s time to crank out the matrix of individual configurations to test. In a follow up post, we’ll continue with setup testing execution and verification.