I Can’t Reproduce That

Sorry, I can’t reproduce that.

I hate saying those words. It seems like a cop-out, like I am giving up on the problem. How do you tell someone that you can’t reproduce their issue? Do you ask for more details? Do you just leave it up to the user because you think it is an isolated user issue?

I always ask for more info. The problem is that sometimes I don’t know what to ask for. On a web app, if I don’t get an error log, I have to ask the user for browser info, and this is an exercise in and of itself. I need to be aware of browser plugins, the state of the cache (when it was last cleared), the user account, the steps used to trigger the issue… and more. After I get the basics necessary for most browser-based issues, I start asking more questions specifically related to the scenario that is bombing. Therein lies the problem. Asking users for info usually results in surface-level details that don’t help get down to the nuts and bolts of the problem. Also, I don’t have a test lab readily available to me, so spinning up a browser configured like the user’s is not possible.
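The basics above could be kept as a small checklist so it is obvious which question to ask next. This is only a sketch: the field names, the BrowserReport class, and the wording of each question are my own assumptions, not a standard intake form.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BrowserReport:
    # Hypothetical fields mirroring the basics listed above; None means "not asked yet".
    browser_and_version: Optional[str] = None
    plugins: Optional[str] = None
    cache_last_cleared: Optional[str] = None
    user_account: Optional[str] = None
    steps_to_trigger: Optional[str] = None


# Plain-language prompts, worded so a non-technical user can answer them.
QUESTIONS = {
    "browser_and_version": "Which browser are you using, and which version?",
    "plugins": "Which extensions or plugins are installed in that browser?",
    "cache_last_cleared": "When did you last clear the browser cache?",
    "user_account": "Which account were you signed in with?",
    "steps_to_trigger": "What did you click or type, step by step, before it failed?",
}


def follow_up_questions(report):
    """Return the question for every field that is still unanswered."""
    return [QUESTIONS[f.name] for f in fields(report) if not getattr(report, f.name)]
```

Filling in a BrowserReport as answers arrive and calling follow_up_questions(report) gives the remaining questions in a fixed order, which at least covers the surface details before moving on to scenario-specific ones.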

It is hard to get the info you need without stepping through a debug session to see what is going on. I sometimes have to resort to adding instrumentation to the code in the hope of getting better error messages or state info, but this requires a push to production, and the user has to wait until all of this is done. So, how can I analyze an issue without this information?
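The kind of instrumentation I mean might look something like the sketch below. It is written in Python for brevity rather than in my actual stack, and log_failure_context and apply_discount are hypothetical names made up for illustration. The wrapper logs the arguments and the stack trace when a call fails, then lets the original error continue.

```python
import functools
import logging

logger = logging.getLogger("repro")


def log_failure_context(func):
    """Log the call's arguments and stack trace when func raises, then re-raise."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception records the stack trace along with the message.
            logger.exception(
                "%s failed: args=%r kwargs=%r", func.__name__, args, kwargs
            )
            raise
    return wrapper


@log_failure_context
def apply_discount(price, percent):
    # Hypothetical business method standing in for the scenario that is bombing.
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (100 - percent) / 100
```

A call such as apply_discount(200, 150) still raises ValueError, but the log now shows what was passed in. Arguments can contain sensitive data, so in practice they would probably need filtering before being logged.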

I know there are solutions out there that will record live user sessions in production and even record the stack trace down to the method that caused the issue. Unfortunately, I don’t have access to any of these tools. I guess I will just have to get better at asking questions that even tech-illiterate users can follow.

Thanks, excellent suggestion. I work in a .NET stack. I am aware of a few tools that can be used to make this much simpler, but unfortunately I am beholden to corporate rules and regulations, and right now there is no way I can get something like that past operations and into production… even if it is free.

Do you have a post about the monitoring tools you use and do you use them in the context of DevOps?

One of our admins was showing me some of our uses of Zabbix today. It seems quite good for handling incident response, including automated script responses. For example, in production we use it to watch the growth of a particular log file and restart a known flaky service (until we can track down the root cause). We are also looking into other tools. I have a trial of AppDynamics, which seems quite useful. I may also look into New Relic and potentially other solutions.
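To make the log-growth idea concrete, here is a rough standalone sketch in Python. It is not Zabbix configuration and does not use any Zabbix API. The function names, the byte threshold, and the restart command are all assumptions for illustration.

```python
import os
import subprocess


def log_growth(path, last_size):
    """Return (bytes_grown, current_size) for the log file at path."""
    current = os.path.getsize(path)
    # A smaller size means the log was rotated or truncated, so count from zero.
    grown = current - last_size if current >= last_size else current
    return grown, current


def check_and_restart(path, last_size, max_growth, restart_cmd):
    """Run restart_cmd if the log grew by more than max_growth bytes."""
    grown, current = log_growth(path, last_size)
    restarted = grown > max_growth
    if restarted:
        # restart_cmd is a hypothetical argument list for whatever restarts the service.
        subprocess.run(restart_cmd, check=True)
    return restarted, current
```

A scheduler would call check_and_restart at a fixed interval and pass the returned size back in as last_size on the next run. A monitoring tool adds the parts this sketch leaves out, such as alerting, history, and escalation.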

Thanks again for the info. In my previous job we used a tool called DynaTrace for application monitoring. You have given me a little motivation to try to champion an app monitor at my current job. Let me know when you have that post. I’d love to read about your findings.