Logging: Exploratory Tester’s Friend

I’m on a new project lately, working with a team at QualiTest. We’re testing a class III medical device. This is an exciting project because, for the first time that I am aware of, formalized exploratory testing will be used to do such a validation. We will not rely on masses of procedural test scripts. I’ve been called in on this project because I created the first published formalized ET process in 1999 (for Microsoft) and created, with my brother Jon, session-based test management, which is basically a general form of that Microsoft process.

The QualiTest team consists of senior testers hand-picked for this job, who have regulatory testing backgrounds and an enthusiasm to use their brains while they test. On top of testing well, we have to document our testing well, and trace our testing to requirements. Automatic logging is one of the tools that will help us do that.

I am amazed at how crazy nuts some people get over documentation – how they sweat and shiver if they don’t have a script to cling to – and yet they don’t spare a thought for logging. Logging is great for testers, programmers, and technical support. Logging is automatic documentation. Sing the praises of logging.

I’m talking about function-level logging built into the products we test.

If you test a web app, you already have this (the web server and application logs, plus the use of a proxy to log locally, if you want) or would have it with a little tweak here and there by the programmer. For desktop apps, the programmer has to build it in. Here’s why he should do that right away:

Instead of following a script written weeks or months ago by some over-literal, function-besotted and data-blind intern, the tester can think, explore, play, and maintain the thread of inquiry without worrying about being unable, later on, to say what was tested.

Instead of remembering what you tested, the product tells you how you tested it. Process the log with a simple Perl script, and you can potentially have an automatically generated test report.
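To illustrate that log-to-report idea, here is a sketch in Python rather than Perl; the function name and the four-field tab-delimited layout are my assumptions, not from the post. It tallies how many times each event type was exercised during a session:

```python
# Hypothetical sketch: summarize a tab-delimited log
# (timestamp <TAB> event_id <TAB> description <TAB> details)
# into a count of how many times each event type was exercised.
from collections import Counter

def summarize_log(lines):
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1  # tally by event description
    return counts

log = [
    "2010-03-01 10:00:00.123\tEV001\tSearch executed\tquery=aspirin",
    "2010-03-01 10:00:00.456\tEV002\tReport generated\tpages=3",
    "2010-03-01 10:00:01.001\tEV001\tSearch executed\tquery=ibuprofen",
]
print(summarize_log(log))
```

The same tally, grouped by session or by requirement ID carried in the details field, is the seed of an automatically generated test report.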

Instead of just wondering how you made that crazy bug happen, the developer can consult the log.

Instead of asking the customer what he was doing moments before the crash, the support engineer asks for the log.

If logging is built into the base classes of the product, very little coding is involved.

This idea first occurred to me in 1993, after hearing from John Musa about how his telecom systems would “phone home” with data about how they were being used, but I couldn’t get a programmer to put logging into anything I tested until I was at SmartPatents in 1997. Since then I’ve helped several projects, including a couple of medical device projects, get going with it.

On this most recent project I was asked to create requirements to specify the logging. Here is the generic version of what I came up with:

2. The results of any diagnostic self-tests or assert failures shall be logged.

3. Any function that causes a change to data, screen display, system configuration, modes or settings, that communicates with other equipment, or that produces an error or information message should be logged, whether or not the function was triggered by user action.

4. Everything that could be interesting and useful for testing, support, and system debugging should be logged UNLESS the event occurs so frequently (many times a second) that it poses a performance or reliability risk.

5. Each log event shall include at least the following information:
– Time stamp: for instantaneous events, a time stamp with millisecond resolution. For events that span time, log the start and stop times as two separate events (e.g. “Event START”, “Event END”). Events that set a persistent mode or state can be logged as one event (“high security mode ON”), but the state of any such modes shall be automatically logged at startup and shutdown so that a complete record of each setting can be maintained over time.
– Event type ID: always unique to the event type; IDs are not re-used if an event is retired and a new event is created.
– Event type description: a short, unique, human-readable label.
– Event information: any data associated with the event that may be useful for customer service or for assessing test coverage; this data may be formatted in ways specific to that event type.

6. At startup and shutdown, the current settings, modes, and configuration shall be recorded to the log.

7. Any errors shall be recorded to the log, including the actual text of the error message.

8. Every type of loggable event shall be stored in one table in the source code or in a data structure accessible on the system itself, such as a header file, enum, array or resource file. This facilitates providing the validation and customer service teams with a complete list of all possible events.

9. The log format shall be in text form, structured and delimited consistently such that it can be parsed automatically by a third party tool. The data for each event should be on one line, or else be marked with standard start and end markers.

10. The log format should be structured and delimited such that it is reasonably human readable (such as tab delimited).

11. The level of detail included in the log file should be configurable in terms of preset levels: 1 – error and service events only; 2 – functional, error, and service events; 3 – all events, including diagnostic information messages about internal states and parameters.

12. The log should behave as a ring buffer with a maximum of X events (where X is configurable within some limit that would not be exceeded in 7 days of heaviest anticipated use). If the size of the log exceeds available space, the oldest events shall be discarded first.

13. When the log is exported, it should have a header that identifies the software version (and serial number of the HW, if applicable) and current configuration.
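To make the requirements concrete, here is a minimal sketch of a logger that follows several of them: preset detail levels (item 11), tab-delimited one-event-per-line output (items 9 and 10), a ring buffer that discards the oldest events first (item 12), and an identifying header on export (item 13). All names and level numbers are illustrative assumptions, not part of the requirements:

```python
# Hypothetical sketch of a logger meeting several of the requirements above.
# The class name, level constants, and field layout are invented for illustration.
from collections import deque
from datetime import datetime

LEVEL_ERROR, LEVEL_FUNCTIONAL, LEVEL_ALL = 1, 2, 3

class RingLogger:
    def __init__(self, max_events, level=LEVEL_FUNCTIONAL, sw_version="1.0"):
        self.events = deque(maxlen=max_events)  # oldest events discarded first
        self.level = level
        self.sw_version = sw_version

    def log(self, level, event_id, description, info=""):
        if level > self.level:
            return  # below the configured detail threshold
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]  # ms resolution
        self.events.append(f"{stamp}\t{event_id}\t{description}\t{info}")

    def export(self):
        header = f"# software_version={self.sw_version}"
        return "\n".join([header] + list(self.events))

logger = RingLogger(max_events=3, level=LEVEL_FUNCTIONAL)
logger.log(LEVEL_ERROR, "EV100", "Error shown", "msg=File not found")
logger.log(LEVEL_ALL, "EV900", "Internal state", "x=42")  # filtered out at this level
logger.log(LEVEL_FUNCTIONAL, "EV200", "Mode changed", "high security mode ON")
print(logger.export())
```

In Python, `deque(maxlen=N)` gives the ring behavior for free: appending to a full deque silently drops an event from the other end.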

Strangely, I’m also in the medical device business and I’ve been looking for something like this for years.

For item 8 (listing every type of loggable event), I remember hearing once about using the localization system to assist in generating a list of loggable events and error messages. If all log messages and error messages have identifiers that begin with “Log_” or “Error”, it should be straightforward to extract them all, along with where in the code they get hit. I haven’t yet had a chance to follow up on this idea. Do you know of anyone who’s looked into this?

[James’ Reply: This is why, in the requirements, I said they should be stored all in one resource file or header file. They are easy to review, hand over, or translate. However, they could simply grep through the code looking for the logging calls, assuming they are all made through a consistent mechanism.]
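The grep approach in James’s reply could be sketched like this, assuming (hypothetically) that every logging call goes through one function named `log_event` with a literal event ID as its first argument:

```python
# Hypothetical sketch: scan source text for calls made through one
# consistent logging mechanism (assumed here to be log_event("EVxxx", ...))
# and list every event ID with the line where it appears.
import re

LOG_CALL = re.compile(r'log_event\(\s*"(EV\d+)"')

def find_log_events(source_text):
    events = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for match in LOG_CALL.finditer(line):
            events.append((match.group(1), lineno))
    return events

sample_source = '''
void save(Doc d) {
    log_event("EV010", "Document saved");
}
void load(Doc d) {
    log_event("EV011", "Document loaded");
}
'''
print(find_log_events(sample_source))
```

Run over a whole source tree, this yields the complete event catalog that requirement 8 asks for, provided the team really does route every event through the one mechanism.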

When I worked as a tester in the Telecom industry we had a requirement that every error/warning message displayed to the user contained a unique error number. This number was composed of the ID of the module/component where the error was generated and a sequence number.

This was extremely powerful when investigating bugs and tracking the actual location in the code where this error/warning was generated.

We actually came to an agreement with our development team that when a specific error number was used more than once in the source code, the full SW package to be tested was returned to development immediately.

I think this is a great idea that I’m also pushing for on my project. I’ve had mixed luck with logging in the past.

I just came off of a project that had logging that actually hurt the product. The project was on Windows and the ETW logging framework was what we used. It was supposed to have just “2% overhead” (yeah, I hear you: “2% of what?” I asked that too…). In practice the overhead was significantly more. Also, some logging statements had bugs in them that led to program instability. This isn’t a counter-case against what you are arguing for, just a cautionary tale.

[James’ Reply: Excellent point. The first time I tested very heavily with logging we spent a week chasing a bug whereby a process on the server started (as noted in the log) but never finished (the event end was not logged). Turns out it was a bug in the logging code triggered when events happened so quickly that the thread tried to open the log file while it was still in the process of being closed.]

Are you using a logging framework, or just using built-in facilities in the language/platform you are developing for?

[James’ Reply: I don’t know how the developer will do it. I don’t know much about logging libraries, but I suppose they must exist and have some cool features. All we need is a formatted print to a file. The challenge comes in when many processes are trying to write to the same log, or when speed is a big concern. Maybe an established logging library would help with that.]

Do you think you are “lucky” to be working in a regulated environment where this type of proposal might have a better chance than with less safety critical software?

[James’ Reply: Yes.]

Was there any feedback from the devs on the proposal?

[James’ Reply: Not yet.]

Do you have any more to say about the level of detail in the logs?

[James’ Reply: I love details. I want data about the events, not just the events themselves. In testing a search system I had the developers dump the SQL for each search, and from this I built an automatic search test matrix. Of course, there has to be a way to deal with the huge size of the logs. That’s why I like being able to set debug levels.]

One conflict that came up between dev and test on my project was that some of the devs wanted a log that would be detailed enough to use to debug with. Testers just wanted “important” events. The “debug” level logs could be 100’s of MB for just a few minutes of product use. It was a major discussion when I made a proposal to adjust what gets logged at what level later in the project.

[James’ Reply: Yeah, that’s an issue. Of course, you can just filter the log. So it’s not that big a deal unless the log is super duper huge. I guess 100’s of MB would qualify as that. At that point the issue is intrusiveness– the log may actually reduce testability by causing a major performance problem in the product. Obviously, you’d have to get smarter about logging or else build a special infrastructure to handle it. Seems to me that super detailed logging might be built on some sort of event trigger, so that it’s not going all the time, but rather only when you hit a panic button or when certain warning events occur.]

What about using a recorder every time we do exploratory testing? We can always throw away the resulting scripts if the session turns out “fruitless”. This doesn’t cover many points from the above list, indeed, but it requires no extra effort on the dev side and it works with existing applications.

[James’ Reply: It depends what you mean by “recorder.” When I do web testing I routinely use Burp Proxy to record everything that goes from the browser to the server. (I have also tried using the Selenium recorder but found it disappointing. It doesn’t record very much.) I use line sniffers when and where feasible. I have used screen recorders, key loggers, and even a little digital camera. There are pros and cons to each approach.

What rarely works, except in a stateless command-line environment, is record-playback automation. There are so many things that can mess that up, you end up fiddling with it all the time.]

Do you think the post is aptly titled? “Logging: Exploratory Tester’s Friend”?

Are you trying to suggest that ET is better because only ET uses logging help?

I would say logging is a damn good support tool (I would even say it is sometimes necessary) for any sort of testing you are doing, whether it is scripted, exploratory, or automated, and even during debugging.

I especially like your initial discussion of how documentation gets to people, so that they end up asking for something they often don’t really have any use for. I’ve seen it multiple times, and it’s a sad waste of everybody’s time.

This also reminds me of back some twelve-thirteen years ago, when I was programming heavy calculations in SQL. After each step I created a complete table as a copy of all the data I was working on, so I could debug step-by-step. It made life so much easier.

Thinking about it, it seems pretty ridiculous that we have gotten used to asking “what happened?” without questioning why it couldn’t be answered… Just because it’s our job to play detective and reconstruct what led to a particular failure doesn’t mean we should do it without the proper tools.

I worked for a couple of years at a company building real-time embedded systems. It was a big challenge to find the equilibrium between logging volume and the real-time performance of the embedded application. The main problem was resource sharing (resources went either to performance or to logging). In that context, logging is not so easy to implement. Finally we found that we needed at least two levels of testing: one “functional, not real-time related,” and one “real-time related,” during which we didn’t use much logging (only critical points). After all that, we ran ATP sessions on a non-debug version of the software.

I worked as a performance tester in an SOA environment for a middleware organization. We had a system that was very similar to what you’re describing. It was a collection of Java web services and batch applications. Their logging system was heaven to me as a test manager. It maintained a consistent transaction ID throughout the system which made it much easier to get a better picture of what was going on as requests/responses were passed from service to service and across various message queues.

One thing that was nice was that the logs were not written to files; rather, they were submitted via XML objects to a logging service, which greatly enhanced the efficiency and was less of a drain on the system (as writing to disk is slower and more of a resource hog). Also, there was a configuration file which allowed us to set different logging levels for each environment (info, error, debug, etc…). The data/variable pairs would align with a global variable for whichever environment the code was deployed in.
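The consistent-transaction-ID idea the commenter describes might be sketched like this; `handle_request`, the component names, and the ID format are all my invention, not the commenter’s system:

```python
# Hypothetical sketch: generate one ID per incoming request and carry it
# through every log line as the request passes between components, so the
# lines for one transaction can be correlated across the whole system.
import uuid

def log(txn_id, component, message):
    return f"{txn_id}\t{component}\t{message}"

def search_service(txn_id, query):
    # a downstream component logs under the same transaction ID
    return log(txn_id, "search", f"executing {query}")

def handle_request(query):
    txn_id = uuid.uuid4().hex[:8]  # one ID for the whole transaction
    lines = [log(txn_id, "gateway", f"received {query}")]
    lines.append(search_service(txn_id, query))
    lines.append(log(txn_id, "gateway", "response sent"))
    return lines

for line in handle_request("q=aspirin"):
    print(line)
```

Grepping the merged logs for one transaction ID then reconstructs the request’s path across services and message queues.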

[James’ Reply: Cool. I have also used SQL Server transaction logs as log files, and you get those for free in database apps.]

I hesitate to say this, but we were a scripted organization 🙂 I hope we’re allowed to post here. [James’ Reply: You can post it. But what do you mean by it? Detailed procedural scripts? If so, why? You had logging, which is like having a GPS track of where you were patrolling, but you still felt that letting your drivers patrol as they saw fit was bad?]

While we may differ on certain aspects of testing, I agree wholeheartedly that logging (or the absence of) is one of the most overlooked gems for a tester. Monitoring and rigid examination of logs is something I try to instill into each member of my testing team. When engaged in a project in the planning and analysis phases, it’s also something I urge the architects to consider.

[James’ Reply: We don’t yet know if we differ. I do plenty of planning in my testing, when the situation calls for it. To be exploratory means to not STOP planning once you have a “plan.”]

If I may, let me add a 14th point, an insight I learned from a tester in my team: For functions the software does by itself, in addition to logging the result of each step after it happens (imported data; generated report; exported graph…) it is very useful to log *what the next step will be*. This way, when a failure occurs, it is easy to know what the software was trying to do.

Since I learned that, I have been paying attention to it, and I am impressed by how much common logs or progress reports lack this. An operation can take a long time, yet you have to wait until it ends to know which operation it was. Once my computer froze during Windows’ shutdown with the “playing logoff sound” screen. I’m pretty sure the sound had finished playing, and I was left wondering what caused the OS to freeze.

Hope you approve the addition 🙂

[James’ Reply: Let’s say I’m a programmer. How would I actually implement this? Are you just saying you want me to log the start of events, instead of just logging when things end?

Obviously, I can’t know, from within a specific routine, what the next thing will be.]

The benefit is much bigger for progress/status reports (that are seen in the screen as actions happen), but it is the same for logs: There are times where an application will crash, but all the data the users/testers have is what was done until now. The developer can (easily or not) deduce what the app was doing when it crashed because he knows the flow — but the users are left with guessing.

Without this little aid, if function E crashes when called after D, but E is *not always* called, a log for a crash would look like “A, F, B, D, ” while a log without a crash would look like “A, F, B, D, G, A”. Hard to see the reason, right? Had the testers known that function E was starting, the ‘sporadic’ issue would have been isolated much more easily.
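The “log what the next step will be” idea can be sketched like this; the step names echo the commenter’s D and E example, and the pipeline structure is my assumption:

```python
# Hypothetical sketch: announce each step *before* it runs, so after a
# crash the last log line names the step that was executing, not just
# the last one that finished.
def run_pipeline(steps, log):
    for name, func in steps:
        log.append(f"START {name}")  # written before the step runs
        func()
        log.append(f"END {name}")

def import_data():
    pass  # stands in for step D

def generate_report():
    raise RuntimeError("sporadic failure in E")  # stands in for step E

log = []
try:
    run_pipeline([("D import data", import_data),
                  ("E generate report", generate_report)], log)
except RuntimeError:
    pass
print(log[-1])  # the last entry names the step that was in progress
```

If only completed steps were logged, the trace would end at “END D import data” and the crash in E would be invisible.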

James — we do this kind of logging in TestExplorer. We don’t cover all your points, although we likely will soon. 😉 However, we have three levels of logging, which are easily selected in the user options; the logs are written out to simple html, so anyone can view them; and each entry is time/date stamped and nested, so you can follow the flow as functions call into functions call into functions, and then back out. It has been one of the best decisions we ever made; the ability to simply have a client turn logging onto super turbo mode, recreate a problem, and mail the log to us continuously helps us understand problems as they emerge.

This could be a wonderful way to show reports to managers. Managers complain about “visibility” when doing exploratory testing. Logging everything and then extracting the essential stuff looks like a way to provide that, to me.

I keep on coming back to it time & time again. I give it to testers but usually I hand a copy to the developers. They complain at first and after they read it they go: “Hey, this is really a good idea….”.

Maybe you should re-work this into something more prominent & permanent on your site. I think it adds a lot of value.

I also find logging especially interesting from an automation perspective. I can have watchers on a log that alert me if certain conditions I want to know about occur (e.g. to see whether a defect has actually been resolved when there is doubt about the fix).
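A log watcher of the kind described might be sketched like this; the condition names and the log layout are assumptions for illustration:

```python
# Hypothetical sketch: scan new log lines against a set of named
# conditions and collect an alert whenever one matches.
def watch(lines, conditions):
    alerts = []
    for line in lines:
        for name, predicate in conditions.items():
            if predicate(line):
                alerts.append((name, line))
    return alerts

conditions = {
    # fires if the supposedly fixed event EV500 shows up as an error again
    "fix regressed": lambda l: "EV500" in l and "ERROR" in l,
}
new_lines = [
    "10:00:01\tEV200\tINFO\tmode changed",
    "10:00:02\tEV500\tERROR\tsave failed",
]
print(watch(new_lines, conditions))
```

In practice the watcher would tail the live log rather than take a fixed list, but the matching logic is the same.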

Thanks, James! You have raised a nice topic about logging in software testing. This post is very relevant; I wish all testers would read it. I would like to add the following two points.

Best practices for logging software testing bugs: • Take an image of every step in the process. [James’ Reply: That is not a best practice. If it’s a practice you like, just say that. Meanwhile, you should talk about what situations would not benefit from this practice.]

• A lot of tools automate a number of features of the procedure, which serve up to add to effectiveness and decrease costs. [James’ Reply: I don’t know what that means, but it can’t be a best practice, since there is no such thing as a best practice.]

This might be a magnificent method to demonstrate reports to managers. Thanks again for this post!