Is Troubleshooting an Art or Science?

“What do I always say? Anyone can cook.”

I am often reminded of this quote from Disney Pixar’s Ratatouille when I think about troubleshooting. But is it true that anyone can troubleshoot computer problems or debug code?

The Troubleshooting Process

Much like cooking, troubleshooting has a simple recipe that anyone can follow. It can be broken down into four steps:

  1. What are the symptoms?
  2. Can the problem be reproduced?
  3. What changed since the last time the system worked?
  4. Find a solution to the problem.

1. What are the symptoms?

This step will usually be completed by the end users of your application; their complaint is the catalyst that starts the troubleshooting process. You want to identify every symptom that is occurring, to help narrow down the problem:

  • Is the application crashing?
  • Are you receiving an error message?
  • Are you getting unexpected results?

2. Can the problem be reproduced?

Sometimes an error is a one-time occurrence, such as a connection to an external database that is temporarily unavailable but works correctly again after an unspecified amount of time. Sporadic errors like these are more difficult to troubleshoot, because several things might be wrong at the same time and finding them may require a process of elimination.

For other issues, you want to identify a predictable pattern that triggers the error. For example: if I enter a zero into this field and press Calculate, I receive a divide-by-zero error message.
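A reproduction like the one above can often be captured in a few lines of code, so the same inputs can be retried after every attempted fix. The following is only a sketch: `calculate` is a hypothetical stand-in for whatever the real application does when the button is pressed, and the input values are made up for illustration.

```python
def calculate(total: float, count: float) -> float:
    """Hypothetical stand-in for the application's Calculate button."""
    return total / count


def reproduces(value: float) -> bool:
    """Return True if entering `value` triggers the divide-by-zero error."""
    try:
        calculate(100.0, value)
    except ZeroDivisionError:
        return True
    return False


# Try the suspect input alongside inputs that are expected to work.
results = {value: reproduces(value) for value in (0, 1, 5)}
print(results)
```

If only the zero input reports the error, the pattern is predictable, and you have a small test case to hand to whoever works on the fix.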

3. What changed since the last time the system worked?

A program should never change its behavior on its own (unless you are programming A.I.). Something must have changed somewhere in the system; it could be code related, hardware related, operating system related, etc. The change may have resulted from something you did not do intentionally, such as Windows installing automatic updates or malicious code.

Many applications and operating systems keep logs that record any changes that may have happened; these will be your primary source for determining what changed.
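As a rough illustration of using logs this way, the sketch below filters log entries down to those recorded after the last time the system was known to work. The log lines, their "timestamp message" layout, and the `entries_since` helper are all assumptions made for this example; real logs vary widely in format.

```python
from datetime import datetime

# Hypothetical log lines in an assumed "ISO-timestamp message" layout.
LOG_LINES = [
    "2024-03-01T09:15:00 Service started",
    "2024-03-02T02:00:00 Installed automatic update",
    "2024-03-02T08:30:00 Configuration file modified",
]


def entries_since(lines, last_known_good):
    """Return (timestamp, message) pairs logged after last_known_good."""
    recent = []
    for line in lines:
        stamp, _, message = line.partition(" ")
        when = datetime.fromisoformat(stamp)
        if when > last_known_good:
            recent.append((when, message))
    return recent


# Assumed: users report that everything still worked on the evening of March 1.
last_worked = datetime(2024, 3, 1, 18, 0)
changes = entries_since(LOG_LINES, last_worked)
for when, message in changes:
    print(when.isoformat(), message)
```

Everything that survives the filter is a candidate answer to "what changed?", which is usually a much shorter list than the full log.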

4. Find a solution to the problem.

This is the step where troubleshooting becomes more of an art.

Steps 1 – 3 are really a discovery process – the answers can usually be obtained by asking end users some basic questions.

Occasionally, the solution to a problem is a relatively simple fix and the troubleshooting process ends here. If the solution is not apparent after the discovery process, you will need to do some research.

The difficulty lies in knowing where to look. This, unfortunately, is something that will only come with experience. One of the first questions that new programmers confront is, “How do I know what data types are available or which one to use?” If you look at it, variations of this question come up no matter what profession you have:

Chef: “How do I know what ingredient to use?”

Doctor: “How do I know what to prescribe for those symptoms?”

Baseball Player: “How do I know how to swing the bat for the most impact?”

After you’ve tried a solution, did you fix the problem? Troubleshooting may become a recursive process, because you may inadvertently introduce new errors and will need to start the troubleshooting process again to find a solution to the new problem.
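One way to catch newly introduced errors early is to re-run a handful of checks after each attempted fix: one for the original problem and a few for behavior that used to work. The sketch below assumes a hypothetical `calculate` function that has just been changed to reject a zero; the check names and values are invented for illustration.

```python
def calculate(total, count):
    """Hypothetical fixed version: reject a zero count with a clear message."""
    if count == 0:
        raise ValueError("count must not be zero")
    return total / count


def run_checks():
    """Return the names of the checks that fail (an empty list means all pass)."""
    failures = []

    # Check that the original problem is gone.
    try:
        calculate(100, 0)
        failures.append("zero input was accepted")
    except ValueError:
        pass
    except ZeroDivisionError:
        failures.append("zero input still crashes")

    # Check that behavior which worked before the fix still works.
    if calculate(100, 4) != 25:
        failures.append("ordinary division broke")
    if calculate(-9, 3) != -3:
        failures.append("negative input broke")

    return failures


print(run_checks())
```

A non-empty list means the fix either missed the original problem or broke something else, and the process starts over with the new symptom.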

The good news is there are plenty of resources (see 17 Websites for Sharing Programming Knowledge) where we can seek out expert advice and find answers to our questions.

Experience is what sets the masters apart from the beginners.

Whether it’s cooking or troubleshooting, anyone can do it. However, each person will have varying degrees of success depending on their experience level.


One comment on “Is Troubleshooting an Art or Science?”
  1. SteveC says:

    Ages ago, when learning the ropes of SunOS system administration, I was told something which has stuck with me, and served me well, many, many times. It is similar to your item number 3.

    It is this: “When you really get stuck, and something isn’t working that once worked, try to find something similar to it that still works. Then gradually change the working system, one thing at a time, to be more like the broken system.”

    Or, in short, compare the working system to the broken system.

    This is the fundamental idea behind, for example, git-bisect, but the concept is more general, and extends to hardware, auto mechanics, anything. It’s really a part of science. That is, in science, people make hypotheses, do experiments to try to corroborate or disprove those hypotheses, and peers try to duplicate the experimental results. Just because the answers to questions aren’t straightforward doesn’t make something art and not science. The bleeding edge of science consists of pretty much nothing but questions to which the answers are anything but straightforward.
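The comparison idea in this comment can be sketched in code. The version below is a binary search in the spirit of git-bisect, not git-bisect itself: instead of applying one change at a time, it halves the list of changes on each test. The change names and the `works_with` check are hypothetical, and the sketch assumes the system works with no changes applied, fails with all of them applied, and stays broken once the bad change is in.

```python
def first_bad_change(changes, works_with):
    """Binary search for the first change that breaks the system.

    `changes` is an ordered list of changes. `works_with(n)` reports whether
    the system works with only the first n changes applied.
    """
    good, bad = 0, len(changes)
    while bad - good > 1:
        middle = (good + bad) // 2
        if works_with(middle):
            good = middle  # still working: the culprit comes later
        else:
            bad = middle   # already broken: the culprit is here or earlier
    return changes[bad - 1]


# Hypothetical ordered list of differences between the working and broken systems.
CHANGES = ["update driver", "edit config", "upgrade library", "rotate keys", "patch kernel"]


def works_with(n):
    """Hypothetical test: the system breaks once 'upgrade library' is applied."""
    return "upgrade library" not in CHANGES[:n]


culprit = first_bad_change(CHANGES, works_with)
print(culprit)
```

With five changes this takes two tests rather than up to five, and the saving grows quickly as the list of differences gets longer.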
