CSC207 Software Design
Systematic Debugging

What It Is

Finding and fixing flaws in software

Assume for now that you built the right thing the wrong way

Requirements errors are actually a major cause of software project failure

You're going to spend half your professional life debugging

After all, once it's fixed, you move on to something else

So learn how to do it well

Rule: never debug standing up

The faster you try to work, the slower you'll go

Rule #0: Get It Right the First Time

The simplest bugs to fix are the ones that don't exist

Design, reflect, discuss, then code

A week of hard work can sometimes save you an hour of thought

Make good style a habit

Train yourself to do things right

So that you'll code well even when you're tired, stressed, and facing a deadline

Rule #1: What Is It Supposed to Do?

First step is knowing what the problem is

"It doesn't work" isn't good enough

What exactly is going wrong?

How do you know?

Requires you to know how the software is supposed to behave

Is this case explicit in the specification?

If not:

Do you have enough knowledge to extrapolate?

Do you have the right to do so?

Write it down!

Preferably away from the keyboard and screen

Try not to let what you're seeing influence what you expect to see

Rule #2: Is It Plugged In?

Are you actually exercising the problem that you think you are?

Are you giving it the right test data?

Is it configured the way you think it is?

Is it the version you think it is?

Has the feature actually been implemented yet?

Why are you sure?

Maybe the reason you can't isolate the problem is that it's not there
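One way to answer these questions with evidence rather than memory is to have the program report what it is actually running. The following is a minimal sketch only: the `EnvironmentReport` class and the `app.config` property name are hypothetical and are not part of the course code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects facts about the running program so you can
// confirm you are debugging the build and configuration you think you are.
class EnvironmentReport {

    static Map<String, String> collect() {
        Map<String, String> facts = new LinkedHashMap<>();
        facts.put("java.version", System.getProperty("java.version"));
        facts.put("user.dir", System.getProperty("user.dir"));
        // "app.config" is an assumed property name, used only for illustration.
        facts.put("app.config", System.getProperty("app.config", "(not set)"));

        // Assertions are off unless the JVM was started with -ea.
        boolean assertionsOn = false;
        assert assertionsOn = true; // this assignment runs only when assertions are enabled
        facts.put("assertions", assertionsOn ? "enabled" : "disabled");
        return facts;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> fact : collect().entrySet()) {
            System.out.println(fact.getKey() + " = " + fact.getValue());
        }
    }
}
```

Printing a report like this at startup, or into the log, makes "is it the version you think it is?" a matter of reading rather than guessing.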

Rule #3: Make It Fail

You can only debug things when you can see them going wrong

Find a test case that makes it fail

Then try to simplify it

Then try to find another one

Use data to weed out false hypotheses

If you can't make it fail reproducibly, you're going to have to rely on:

Post-mortem inspection (autopsies)

Logging

But logging can distort the program's behavior (for example, by changing its timing)
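"Find a failing test case, then simplify it" can be partly mechanized. The sketch below is an illustration under stated assumptions, not a standard tool: `FailureMinimizer`, `firstLetters`, and `crashes` are hypothetical names, and the bug (calling `charAt(0)` on an empty string) is planted deliberately. The minimizer greedily drops one element at a time and keeps the smaller input whenever the failure still reproduces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FailureMinimizer {

    // Greedy reduction: remove one element at a time and keep the smaller
    // input whenever the failure is still observed.
    static <T> List<T> minimize(List<T> failingInput, Predicate<List<T>> fails) {
        if (!fails.test(failingInput)) {
            throw new IllegalArgumentException("Input does not reproduce the failure");
        }
        List<T> current = new ArrayList<>(failingInput);
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (int i = 0; i < current.size(); i++) {
                List<T> candidate = new ArrayList<>(current);
                candidate.remove(i);
                if (fails.test(candidate)) {
                    current = candidate;
                    shrunk = true;
                    break; // restart the scan on the smaller input
                }
            }
        }
        return current;
    }

    // Hypothetical buggy code under test: crashes on an empty string.
    static String firstLetters(List<String> words) {
        StringBuilder letters = new StringBuilder();
        for (String word : words) {
            letters.append(word.charAt(0));
        }
        return letters.toString();
    }

    // The failure we are chasing: does this input make firstLetters crash?
    static boolean crashes(List<String> words) {
        try {
            firstLetters(words);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("alpha", "", "beta", "gamma");
        List<String> minimal = minimize(original, FailureMinimizer::crashes);
        System.out.println("Reduced from " + original.size()
                + " elements to " + minimal.size());
    }
}
```

The reduced input contains only the element that triggers the crash, which is a much better starting point for forming hypotheses than the original four-element list.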

A Note on Concurrency

Concurrent systems are hard to debug

Many times harder than sequential systems

Hard to keep track of everything that's going on

Ordering of events is unpredictable...

...and often not repeatable

The act of observing sometimes hides the bug

We have yet to invent a good debugger for concurrent systems

Have to rely on logging and inspection more often

Conclusion: use concurrency sparingly
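To make "unpredictable and often not repeatable" concrete, here is a small sketch (the `RaceDemo` class is hypothetical). Several threads increment a plain `int` and an `AtomicInteger` side by side. The atomic total is always exact; the plain total may come up short by a different amount on each run, or may happen to be correct, which is exactly what makes such bugs hard to reproduce.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RaceDemo {

    // Returns {unsafeTotal, safeTotal} after all worker threads have finished.
    static int[] countWith(int threads, int incrementsPerThread) {
        int[] unsafe = new int[1];                 // shared, unsynchronized
        AtomicInteger safe = new AtomicInteger();  // shared, atomic

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    unsafe[0]++;              // read-modify-write: updates can be lost
                    safe.incrementAndGet();   // atomic: no update is lost
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return new int[] { unsafe[0], safe.get() };
    }

    public static void main(String[] args) {
        int[] totals = countWith(4, 100_000);
        System.out.println("expected = " + (4 * 100_000));
        System.out.println("unsafe   = " + totals[0] + "  (may differ from run to run)");
        System.out.println("safe     = " + totals[1]);
    }
}
```

Adding a print statement inside the loop changes the timing and can make the lost updates rarer, an instance of observation hiding the bug.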

Rule #4: Divide and Conquer

Once you have a test that makes the system fail, isolate the faulty sub-system

Faults can only occur upstream from where the system fails

Look at input to the module that's failing

If that's wrong, look at the preceding module's input, and so on

Fail early, fail often

Use assert to trap consistency errors early

Every bug turns into an assertion
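The idea of checking the data at each module boundary can be sketched as follows. This is an illustration only: `PipelineCheck`, `Stage`, and the three-stage text pipeline are hypothetical, and the middle stage contains a deliberate bug. The sketch checks boundaries in order from the first stage, which locates the same module as walking upstream from the point of failure.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

class PipelineCheck {

    // One module of the pipeline plus a consistency check on its output.
    static final class Stage {
        final String name;
        final UnaryOperator<String> step;
        final Predicate<String> outputIsConsistent;

        Stage(String name, UnaryOperator<String> step, Predicate<String> outputIsConsistent) {
            this.name = name;
            this.step = step;
            this.outputIsConsistent = outputIsConsistent;
        }
    }

    // Runs the stages in order and returns the name of the first stage whose
    // output fails its check, or null if every check passes.
    static String firstFaultyStage(List<Stage> stages, String input) {
        String data = input;
        for (Stage stage : stages) {
            data = stage.step.apply(data);
            if (!stage.outputIsConsistent.test(data)) {
                return stage.name;
            }
        }
        return null;
    }

    // Hypothetical pipeline: trim, lowercase, collapse runs of spaces.
    static List<Stage> demoStages() {
        return Arrays.asList(
            new Stage("trim", String::trim, s -> s.equals(s.trim())),
            // Deliberate bug: this stage upper-cases instead of lower-casing.
            new Stage("lowercase", String::toUpperCase, s -> s.equals(s.toLowerCase())),
            new Stage("collapse-spaces", s -> s.replaceAll(" +", " "), s -> !s.contains("  ")));
    }

    public static void main(String[] args) {
        System.out.println("First faulty stage: "
                + firstFaultyStage(demoStages(), "  Hello   World "));
    }
}
```

Once the faulty module is known, its output check is a natural candidate to keep as a permanent assertion.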

A Note on Assertions

Added to Java in Version 1.4

Check a Boolean condition

If true, do nothing

If false, throw an AssertionError

Can optionally provide a message (typically a string)

System displays this message when the assertion fails

Assertions are often used to check the inputs and outputs of methods

Defensive programming

Do not use assertions to check the correctness of user data

Bad user data is not a failure in the code

/**
 * Perform some operation on arg1 and arg2.
 * @param    arg1    First argument (must be non-null).
 * @param    arg2    Second argument (must be non-null and contain some data).
 * @return   A non-empty strict sub-string of arg2.
 */
public String method(Object arg1, String arg2) {
    // Preconditions: check the inputs against the documented contract.
    assert arg1 != null : "First argument is null";
    assert (arg2 != null) && (arg2.length() > 0) :
           "Second argument is null or empty";
    // ...lots of code here, computing result...
    // Postcondition: result must be non-empty and strictly shorter than arg2.
    assert (result != null) && (result.length() > 0)
           && (result.length() < arg2.length()) :
           "Invalid result";
    return result;
}
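The difference between checking user data and checking the code itself can be sketched like this (the `AgeParser` class, its methods, and the 0 to 150 range are hypothetical choices for illustration). User input is validated with an ordinary exception, which is always checked; the internal invariant uses `assert`, which only runs when the JVM is started with `java -ea`, because assertions are disabled by default at run time.

```java
class AgeParser {

    // User data: validate with an ordinary exception, never with assert.
    static int parseAge(String text) {
        if (text == null) {
            throw new IllegalArgumentException("Age is missing");
        }
        int age;
        try {
            age = Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Age is not a whole number: " + text, e);
        }
        if (age < 0 || age > 150) { // the range is an assumption for this sketch
            throw new IllegalArgumentException("Age out of range: " + age);
        }
        return age;
    }

    // Internal code: callers are expected to pass an already validated age,
    // so a violation here is a bug in the program, not bad user input.
    static int yearsUntil(int age, int targetAge) {
        assert age >= 0 : "age must be validated before calling yearsUntil";
        int years = Math.max(0, targetAge - age);
        assert years >= 0 : "yearsUntil produced a negative result";
        return years;
    }

    public static void main(String[] args) {
        int age = parseAge(" 42 ");
        System.out.println("Years until 65: " + yearsUntil(age, 65));
    }
}
```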

A Note on Modularity

Modularity aids debugging

As well as design and re-use

Can't isolate a faulty sub-system if there are no sub-systems

Widely-used modules are less likely to contain bugs

Not the same as saying "older modules"

But bugs often arise from interactions between modules

Often the hardest to track down

Rule #5: Change One Thing at a Time

Replacing random chunks of code is unlikely to do much good

If you got it wrong the first time, what makes you think you'll get it right the second?

Or ninth?

Change one thing, then re-run your tests

This is why automated unit tests are so useful

Fixing one thing often breaks another

Remember: assert is your friend

And the next person to work on your code will find it easier to understand if you make your expectations explicit
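A sketch of what "change one thing, then re-run your tests" looks like in practice. In a real project this would be a unit testing framework; the plain-Java runner below is a stand-in so the example stays self-contained, and `RegressionChecks` and `initials` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RegressionChecks {

    // Hypothetical function being repaired.
    static String initials(String fullName) {
        StringBuilder letters = new StringBuilder();
        for (String part : fullName.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                letters.append(Character.toUpperCase(part.charAt(0)));
            }
        }
        return letters.toString();
    }

    // Runs every recorded case and returns the number of failures.
    static int runAll() {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("Ada Lovelace", "AL");
        expected.put("  grace   hopper ", "GH"); // extra spaces, lower case
        expected.put("Turing", "T");             // single word
        expected.put("", "");                    // empty input

        int failures = 0;
        for (Map.Entry<String, String> testCase : expected.entrySet()) {
            String actual = initials(testCase.getKey());
            if (!actual.equals(testCase.getValue())) {
                failures++;
                System.out.println("FAIL: \"" + testCase.getKey() + "\" gave \""
                        + actual + "\", expected \"" + testCase.getValue() + "\"");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " failure(s)");
    }
}
```

Because the whole suite runs in one step, it is cheap to re-run after every single change, and a fix that breaks an older case shows up immediately.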

Rule #6: Write It Down

If you try to keep the state of play in your head, you will lose track (and time)

"Wait, didn't A followed by B with half-width data cause those symptoms? Or was it B followed by A? Or was it A and B with full-width data?"

You're a scientist: work like one

Records are particularly useful when getting help

People are more likely to listen when you can explain clearly what you did

A Note on Quality Assurance

Question: What do testers do while waiting for code?

Answer: read the specification and start writing down test cases

If there's no specification, you're wasting valuable time

And if QA isn't involved in drawing up the specification, to guarantee testability, you're wasting even more

Rule #7: Be Humble

If you can't find it in 15 minutes, ask for help

Just explaining the problem aloud is often enough

"Tell it to the duck"

Ego-less programming

Don't think of it as "your" code

And don't think of its flaws as your own

Keep track of your mistakes

After all, runners record their times, don't they?

You're most likely to fix what you're paying attention to

The Personal Software Process (PSP)

Common Errors

Complex errors are one-of-a-kind

That's part of what makes them complex

Simple errors crop up again and again

Use these patterns to drive testing

This is looking where the light is, rather than where you dropped your keys

But it is a good way to check the quality of new code

After a while, you'll stop making these mistakes

Numbers

Zero

Smallest number

Smallest magnitude

Most negative

Largest number

Especially when allocating data

Off by one in a loop

Don't count integers when you could iterate directly through a collection
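A few of these numeric edge cases can be demonstrated directly in Java. The `NumericEdges` class is a hypothetical name for this sketch; the constants and `Math.abs` are standard library. Note in particular that `Double.MIN_VALUE` is the smallest positive magnitude, not the most negative value, and that the most negative `int` has no positive counterpart.

```java
class NumericEdges {

    // Naive midpoint: low + high can overflow when both are large.
    static int naiveMidpoint(int low, int high) {
        return (low + high) / 2;
    }

    // Safer form, assuming 0 <= low <= high.
    static int midpoint(int low, int high) {
        return low + (high - low) / 2;
    }

    // Iterates directly over the values, so there is no index to get off by one.
    static long sum(int[] values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Each of these lines prints true.
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);            // most negative int
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // largest int wraps around
        System.out.println(Double.MIN_VALUE > 0);                       // smallest magnitude
        System.out.println(-Double.MAX_VALUE < 0);                      // most negative finite double
        System.out.println(naiveMidpoint(Integer.MAX_VALUE - 1, Integer.MAX_VALUE) < 0);
    }
}
```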

Structure Errors

null

Empty string

Empty path, filename with no extension, filename with only extension, etc.
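The filename cases above make a good checklist for even a very small function. Below is a minimal sketch of an extension extractor (the `FileNames` class is hypothetical). Treating a name such as `.bashrc` as having no extension is a design choice made for this example, not a universal rule.

```java
class FileNames {

    // Returns the extension without the dot, or "" when there is none.
    static String extension(String fileName) {
        if (fileName == null) {
            throw new IllegalArgumentException("fileName is null");
        }
        // Only look at the last path component, so "dir.v2/notes" has no extension.
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String base = fileName.substring(slash + 1);

        int dot = base.lastIndexOf('.');
        // dot <= 0 covers "no dot" and "only a leading dot";
        // a trailing dot means the extension is empty.
        if (dot <= 0 || dot == base.length() - 1) {
            return "";
        }
        return base.substring(dot + 1);
    }

    public static void main(String[] args) {
        String[] samples = { "", "README", ".bashrc", "report.txt",
                             "archive.tar.gz", "dir.v2/notes", "trailing." };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> \"" + extension(sample) + "\"");
        }
    }
}
```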

Collections

Empty collection

Contains exactly one element

Contains maximum number of elements

Has duplicate elements

Contains one element multiple times
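As a sketch of how these collection cases drive testing, consider two small hypothetical helpers in a `CollectionEdges` class. A common mistake in `largest` is to start the running maximum at 0, which passes casual tests and fails on a list containing only negative numbers; the version below starts from a real element instead. The maximum-size case is not covered here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.NoSuchElementException;

class CollectionEdges {

    // Largest value in the list; an empty list has no answer, so say so loudly.
    static int largest(List<Integer> values) {
        if (values.isEmpty()) {
            throw new NoSuchElementException("Empty list has no largest element");
        }
        int best = values.get(0); // start from a real element, not from 0
        for (int value : values) {
            if (value > best) {
                best = value;
            }
        }
        return best;
    }

    // Number of different values, counting duplicates once.
    static int countDistinct(List<Integer> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        System.out.println(largest(Collections.singletonList(-3)));  // exactly one element
        System.out.println(largest(Arrays.asList(2, 2, 2)));         // one element many times
        System.out.println(countDistinct(Arrays.asList(1, 1, 2)));   // duplicates
        System.out.println(countDistinct(Collections.<Integer>emptyList())); // empty
    }
}
```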

Searching

Match not found

Matching element not present

Matching element just outside search range

Exactly one match found

Match element on search boundary

Multiple matches found

Multiple references to a single object found

All objects match
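The searching cases map directly onto test inputs. The sketch below uses a hypothetical `RangeSearch` class with a linear search over the half-open range `[from, to)`; returning the first match when there are several is a choice made for this example.

```java
class RangeSearch {

    // Index of the first occurrence of target in values[from, to), or -1.
    static int indexOf(int[] values, int target, int from, int to) {
        if (from < 0 || to > values.length || from > to) {
            throw new IllegalArgumentException("Bad range: [" + from + ", " + to + ")");
        }
        for (int i = from; i < to; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return -1;
    }

    // Number of elements equal to target.
    static int countMatches(int[] values, int target) {
        int count = 0;
        for (int value : values) {
            if (value == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] data = { 5, 3, 7, 3, 9 };
        System.out.println(indexOf(data, 4, 0, data.length)); // not present
        System.out.println(indexOf(data, 9, 0, 4));           // just outside the range
        System.out.println(indexOf(data, 5, 0, data.length)); // on the lower boundary
        System.out.println(indexOf(data, 9, 0, data.length)); // on the upper boundary
        System.out.println(indexOf(data, 3, 0, data.length)); // multiple matches
        System.out.println(countMatches(new int[] { 3, 3, 3 }, 3)); // all match
    }
}
```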

Graphs

Includes trees, linked lists, graphs, etc.

Most large structures are in this category

Empty

Minimal non-empty (e.g. tree with just root)

Circular

Self-referential (head points to head)

Circular sub-structure

Depth greater than one

Including maximal depth, if appropriate
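Circular structures are a classic source of infinite loops, so it helps to have a check that terminates on them. The sketch below uses Floyd's two-pointer cycle detection on a singly linked list; the `ListCycles` and `Node` classes are hypothetical, written only for this example. The `main` method exercises the empty, minimal, self-referential, and circular sub-structure cases from the list above.

```java
class ListCycles {

    static final class Node {
        final int value;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    // Floyd's cycle detection: advance one pointer by one step and another by
    // two; they can only meet again if the list loops back on itself.
    static boolean hasCycle(Node head) {
        Node slow = head;
        Node fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCycle(null));            // empty

        Node single = new Node(1);
        System.out.println(hasCycle(single));          // minimal non-empty

        Node self = new Node(1);
        self.next = self;
        System.out.println(hasCycle(self));            // head points to head

        Node a = new Node(1);
        Node b = new Node(2);
        Node c = new Node(3);
        a.next = b;
        b.next = c;
        c.next = b;                                    // circular sub-structure
        System.out.println(hasCycle(a));
    }
}
```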

Where Else to Look

You are a creature of habit

Which means you repeat certain mistakes

Noticing them will help you avoid them

Copy-and-paste helps bugs breed

Any piece of code that is repeated in two or more places will eventually be wrong in at least one

Learn idiomatic usage

Bloch: Effective Java. Addison-Wesley, 2001, 0201310058.

Tate: Bitter Java. Manning, 2002, 193011043X.


$Id: debug.html,v 1.1.1.1 2004/01/04 05:02:31 reid Exp $