Spacefem (spacefem) wrote,

how to troubleshoot: discipline vs. knack

Troubleshooting is a beautiful art.

One professor in college said it "could not be taught", that you were either born with "the knack" or you weren't - there's a related Dilbert cartoon. But once I started engineering I realized that there are definitely techniques that not only can be taught, but SHOULD be and AREN'T. If you think troubleshooting is something genetic, you might be a STEM gatekeeper.

The other thing I've realized is that technical troubleshooting is closely related to the old-fashioned Scientific Method we used to have on posters in our elementary school classroom. When did engineers decide we were "practical" and therefore not scientists? There are a few versions of the method, but most go along the lines of:

1) State the problem
2) Do your research
3) Hypothesize
4) Experiment
5) Analyze
6) Write down your results

In my engineering life, these translate really well to...

1) State the problem. Yes, that. You'd be surprised how often we don't get a good description of the problem. An airplane lands and the pilot gets off and says "this autopilot sucks". That is not a good problem statement. "We were flying along straight and level and suddenly the autopilot pitched us straight down at the ground, spun us into a barrel roll then disconnected and called us names" gives us something to work with.

2) Do your research. When I'm overwhelmed, my favorite thing is just to start printing stuff out. Someone says there's an issue with a weird part I've never heard of? Print the spec sheet. Print the wiring diagram and get out my colored pencils. History can be an extremely important part of research. When did this problem start happening? What changed? These questions are especially important when chasing down intermittent issues, the dreaded "could not duplicate" reports that keep us awake weeks after the event.

I have a silly step 2a during this phase and that's "stay hydrated". I realize this does not sound technical at all, but it only takes a second and drinking water helps with so many other things. You're dooming yourself if you can't channel the necessary mental energy into a task for some silly physical reason.

3) Hypothesize/experiment/analyze - these can go pretty quickly together in troubleshooting. Research gives you your hypothesis... you don't think about ways a system can work; you read the way the system SHALL work. Then you can experiment. My favorite metaphor is the joke artists make, that to carve an elephant you start with a big slab of marble and chip away everything that's not an elephant. In troubleshooting, you find the little parts of the system that are working, rule them out, and eventually get yourself to the bit that's not working.

Of course in engineering we do have some trusty go-to experiments:
3a) Make sure everything is plugged in
3b) Try turning it off and then back on again
3c) "Percussive maintenance"
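
When the system is a simple chain where each part feeds the next, the "chip away everything that's working" idea from step 3 can be sketched as a bisection. This is only a rough illustration under assumptions that are mine, not from any real aircraft: the stage names are made up, `output_ok` stands in for whatever measurement you would actually take, and there is exactly one fault with everything downstream of it also looking bad.

```python
from typing import Callable, Optional, Sequence


def first_bad_stage(
    stages: Sequence[str],
    output_ok: Callable[[str], bool],
) -> Optional[str]:
    """Return the first stage whose output is bad, or None if all check out.

    Assumes a linear chain with a single fault: every stage before the
    fault has a good output, and every stage from the fault onward is bad.
    """
    lo, hi = 0, len(stages)  # stages[:lo] are known good, stages[hi:] known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if output_ok(stages[mid]):
            lo = mid + 1  # everything up to and including mid works: chip it away
        else:
            hi = mid  # the fault is at mid or somewhere before it
    return stages[lo] if lo < len(stages) else None


# Hypothetical signal chain, purely for illustration.
chain = ["power supply", "sensor", "wiring harness", "computer", "servo"]


def check(stage: str) -> bool:
    # Pretend the measurements show good outputs only before the wiring harness.
    return chain.index(stage) < chain.index("wiring harness")


suspect = first_bad_stage(chain, check)
print(suspect)
```

Each check throws away about half of what's left, so even a long chain only needs a handful of measurements. Real systems are rarely this tidy (intermittent faults and multiple failures break the assumption), but the habit of splitting the problem in half still helps.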

4) WRITE YOUR SHIT DOWN OMG! Engineering schools and math classes try to get students to show their work but it's never enough. Write down the exact results! Not "the resistance was within tolerance" but EXACTLY what it was, in ohms, in a table, forever. Then I know you checked it. A lot of troubleshooting is done in teams where we want to trust each other but we've all learned from experience to never trust anyone. "Believe half of what you see and nothing that you hear," said a favorite specialist I work with.
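
If your notes live on a computer instead of paper, the "exact value, in ohms, in a table" habit might look something like the sketch below. The column names, the test point and the limits are all invented for the example; the point is only that the row keeps the actual reading and its units next to the pass/fail verdict.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column layout for a troubleshooting log.
FIELDS = [
    "timestamp", "test_point", "measured", "units",
    "low_limit", "high_limit", "in_tolerance",
]


def write_header(out):
    """Write the column names once, at the top of a new log."""
    csv.DictWriter(out, fieldnames=FIELDS).writeheader()


def record_measurement(out, test_point, measured, units,
                       low_limit, high_limit, timestamp=None):
    """Append one row holding the exact reading, not just a verdict."""
    row = {
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "test_point": test_point,
        "measured": measured,
        "units": units,
        "low_limit": low_limit,
        "high_limit": high_limit,
        "in_tolerance": low_limit <= measured <= high_limit,
    }
    csv.DictWriter(out, fieldnames=FIELDS).writerow(row)
    return row


# An in-memory buffer keeps the example self-contained; for a real log you
# would pass a file opened with open(path, "a", newline="").
log = io.StringIO()
write_header(log)
row = record_measurement(log, "J3 pin 4 to ground", 4.7, "ohm", 0.0, 5.0,
                         timestamp="2020-01-01T00:00:00+00:00")
print(log.getvalue())
```

Anyone reading the log later can see that the reading was 4.7 ohm against a 5.0 limit, which is a very different story from "within tolerance" if the next reading comes back at 4.9.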

I'd like to add another important last step... accept your paths. Never beat yourself up. If an issue took four days to solve, be happy it didn't take eight. That holds even if the cause was a tiny, "obvious", silly thing. It frequently is, and those are the ones where we feel the worst. At the end of the day the important thing is whether you learned something new, stuck to the problem and found the answer.

I am convinced now that a "technical person" is not someone with the right genes, just perseverance. We find a starting point even if we've already found 500 starting points that didn't work. Our job is to never run out of ideas. We don't freeze up. When things go badly we can try another approach, take a break, or ask for help. The best "troubleshooters" do not have a divine power to lay their hands on a machine and heal it. They know a LOT, so they don't have to spend as much time doing research to understand what results we should expect from given inputs, and that's great. Maybe they've got a bank in their head of past issues, and that's great too. But we can all get there.

Be thoughtful. Ask questions.

(Stay hydrated!)
Tags: technology