Aborting on Programming Errors

Programming Errors in Code

One class of error that our code can encounter is a programming error — that is, a problem with the code itself. These errors could be as small and subtle as an off-by-one error in a loop, or as large as a flaw in the implementation or design of an important algorithm. What should our code do when it runs into this type of error? Should it attempt to recover gracefully? The answer, in most cases, is no: we shouldn’t be afraid just to make our program abort.

Common Types of Error

Broadly speaking, there are three types of error that our code can encounter. The first is user error: the user is asked to input a number, and they instead input the word “dog”. The second kind is a system error, such as the system running out of memory. The third kind is a programming error, where our code winds up in an invalid state because of an error in the code itself.

Recovering gracefully from user errors is, generally speaking, a requirement. It’s impossible to predict how users will behave with our applications, and we should always write our code to safeguard against every possible input.
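To make the contrast with programming errors concrete, here is a minimal sketch of recovering from the “dog” input described above. The helper name parse_int is hypothetical and not part of any code in this post; it assumes the input arrives as a string, for example from fgets(3). Bad input is reported through the return value, while NULL arguments are treated as a bug in the caller.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse `text` as a base-10 int.
 * Returns true and stores the value in *out on success.
 * Returns false, leaving *out untouched, when the user input is invalid. */
bool parse_int(const char *text, int *out)
{
    /* NULL arguments are a bug in the calling code, not a user error. */
    assert(text != NULL);
    assert(out != NULL);

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* No digits were consumed, e.g. the input was "dog". */
    if (end == text) {
        return false;
    }
    /* Trailing characters other than the newline left behind by fgets(3). */
    if (*end != '\0' && *end != '\n') {
        return false;
    }
    /* The number does not fit in a long, or does not fit in an int. */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return false;
    }

    *out = (int)value;
    return true;
}
```

A caller would typically loop, asking the user again until parse_int returns true. Nothing about invalid input here calls for aborting the program.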

But what about system errors and programming errors? Today’s blog post will focus on programming errors (dealing with system errors is a blog post unto itself).

Use Assertions Liberally

The best way to capture programming errors is to use assertions. By asserting conditions that are expected to be true at critical points in our code, we can not only better document how our code is expected to work, but also catch smaller errors before they cascade into other modules in our program.

In C, the standard facility for performing assertions is the assert(3) macro, declared in <assert.h>. You can read about it by running:

% man 3 assert

The usage of the macro is assert(condition), where condition is something our code expects to be true. If condition is false, assert prints a diagnostic message to standard error and terminates the program by calling abort(3).
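As a small illustration before the larger example, here is a sketch of assert guarding a function’s preconditions. The function average is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the average of the first `count` values. */
double average(const int *values, size_t count)
{
    /* Calling this with no data is a bug in the caller, so we assert
     * rather than inventing a return value for an invalid request. */
    assert(values != NULL);
    assert(count > 0);

    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i];
    }
    return (double)sum / (double)count;
}
```

The two assertions double as documentation: a reader can see at a glance that the function expects a non-NULL array with at least one element, and the division by count can never be a division by zero.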

Progress Bar Without Assertions

Let’s dive into an example, which implements a progress bar in C. The following header, progress_bar.h, declares the interface for a progress bar. The interface allows us to allocate a new progress bar, increment the number of completed tasks, and output the current progress of the overall operation:

#ifndef VCS_PROGRESS_BAR_H_
#define VCS_PROGRESS_BAR_H_

struct progress_bar {
    unsigned int complete;
    unsigned int total;
};

struct progress_bar *pb_alloc(unsigned int total);
void pb_complete_task(struct progress_bar *pb);
void pb_output(const struct progress_bar *pb);

#endif

The following implementation of the progress bar, progress_bar.c, incorrectly attempts to recover silently from any programming errors. Note the implementation of pb_output(const struct progress_bar *), which corrects for two kinds of invalid state: a progress bar with zero total tasks, and a progress bar with more completed tasks than total tasks.

#include "progress_bar.h"

#include <stdio.h>
#include <stdlib.h>

struct progress_bar *pb_alloc(unsigned int total)
{
    struct progress_bar *ret =
        calloc(1, sizeof(struct progress_bar));
    ret->complete = 0;
    ret->total = total;
    return ret;
}

void pb_complete_task(struct progress_bar *pb)
{
    pb->complete++;
}

void pb_output(const struct progress_bar *pb)
{
    double percent;
    if (pb->total == 0 || pb->complete > pb->total) {
        percent = 1.0;
    }
    else {
        percent = (double)pb->complete / pb->total;
    }
    printf("%.02lf%% complete\n", percent * 100);
}

Rather than catching errors when they occur — specifically, the creation of a progress bar with no tasks, or the over-tallying of completed tasks — the progress_bar module shifts the responsibility for dealing with invalid states into the pb_output(const struct progress_bar *) function. Dealing with these potentially invalid states makes the output function unnecessarily complicated.

Beyond just adding unnecessary complexity to the code that outputs the progress bar, the implementation of the progress bar above squanders an opportunity to catch serious programming errors in our code. Consider the following program, with a critical error in it, that uses the progress bar module:

#include "progress_bar.h"

#include <stdlib.h>

struct task
{
    int id;
};

int main()
{
    unsigned int num_tasks = 3;
    unsigned int i;
    struct task *tasks =
        calloc(num_tasks, sizeof(struct task));
    struct progress_bar *pb = pb_alloc(num_tasks);

    for (i = 0; i <= num_tasks; i++) {
        pb_output(pb);
        tasks[i].id = i;
        pb_complete_task(pb);
    }
    pb_output(pb);

    return 0;
}

The code creates a list of tasks, then iterates through the list of tasks performing an operation for each. In this case, the operation is just setting the ID of each task, but in practice such an operation could be complex and require a significant amount of time — hence, the need for a progress bar.

But there’s a subtle error in the code above. The loop condition i <= num_tasks should be i < num_tasks. Because of this off-by-one error, the final iteration writes to tasks[3], one element past the end of the three-element array. This issue could be a serious security flaw in real code.

While this program may crash for you (because of the write beyond allocated memory), it didn’t crash when I ran it on my computer. Instead, I received the clearly incorrect output:

0.00% complete
33.33% complete
66.67% complete
100.00% complete
100.00% complete

Silently correcting for a programming error in one module can hide the underlying cause of the error, and lead to even more significant flaws in our overall program.

Progress Bar With Assertions

Let’s remove the unnecessary complexity from the pb_output(const struct progress_bar *) function, and make it do exactly what it should do:

void pb_output(const struct progress_bar *pb)
{
    double percent = (double)pb->complete / pb->total;
    printf("%.02lf%% complete\n", percent * 100);
}

Now, rather than silently correcting for programming errors, let’s add assertions at the points in the code where these two errors can be caught earliest: creating a progress bar with zero tasks, and completing more tasks than there are in the progress bar. This requires adding #include <assert.h> to the top of progress_bar.c.

struct progress_bar *pb_alloc(unsigned int total)
{
    assert(total > 0);
    struct progress_bar *ret =
        calloc(1, sizeof(struct progress_bar));
    ret->complete = 0;
    ret->total = total;
    return ret;
}

void pb_complete_task(struct progress_bar *pb)
{
    assert(pb->complete < pb->total);
    pb->complete++;
}

This change not only helps document our assumptions about how the progress bar will be used, but it also catches an unrecoverable error. If the code using the progress bar is indicating to the progress bar module that more tasks have been completed than exist, then clearly there is a programming error in the calling code, and one that we can’t (and shouldn’t) meaningfully recover from.

With the assertions in place, the program terminates immediately after the buffer overflow error, rather than letting the erroneous state of our program persist:

0.00% complete
33.33% complete
66.67% complete
100.00% complete
Assertion failed: (pb->complete < pb->total), function pb_complete_task, file progress_bar.c, line 21.

Exercise

I’ve purposely left out several assertions that should be included in the code. Specifically, there isn’t any check that the return value of each call to calloc(3) is non-NULL.

Furthermore, the memory allocated by calloc(3) is never freed with calls to free(3).

I encourage students to add these missing assertions to the code, along with the missing calls necessary to free the allocated memory.

Optionally, we could also add assertions to the pb_output(const struct progress_bar *pb) function to verify that the current state of the progress bar is valid, though personally I feel that may be so many assertions as to clutter the code.
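One way to keep that clutter down is to gather the validity checks into a single helper and assert on it once. The sketch below assumes a hypothetical helper named pb_is_valid, which is not part of the module shown earlier. The struct definition is repeated from progress_bar.h only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Repeated from progress_bar.h so this sketch is self-contained. */
struct progress_bar {
    unsigned int complete;
    unsigned int total;
};

/* Hypothetical helper: true when the progress bar is in a valid state. */
bool pb_is_valid(const struct progress_bar *pb)
{
    return pb != NULL
        && pb->total > 0
        && pb->complete <= pb->total;
}

void pb_output(const struct progress_bar *pb)
{
    /* A single assertion covers every invariant of the structure. */
    assert(pb_is_valid(pb));

    double percent = (double)pb->complete / pb->total;
    printf("%.2f%% complete\n", percent * 100);
}
```

Whether this extra check is worth having remains a judgment call, since pb_alloc and pb_complete_task already catch both errors at the point where they occur.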

Summary

Programming errors are one class of error that our code can encounter. But, the correct solution is not to make our code silently correct for states it should never be in. Doing so may not only unnecessarily shift the burden of code complexity to where it need not be, but it can also disguise more significant problems in the program as a whole.

The best way to capture programming errors is to use assertions, which catch violated expectations as early as possible. Then, because it’s typically impossible to recover meaningfully from programming errors, it usually makes sense just to abort the program when these errors are detected.
