Why does this C word histogram program print an overflow of 0?

106 views Asked by At

I am studying from the K&R C Programming Edition and one of the exercises (1-13) is to print a word length histogram which I mostly succeeded in doing. However, there is one issue.

if (nc < MAXLEN)
    ++wlen[nc];
else if (nc > MAXLEN)
    ++overflow;

According to the above, this should increment the value of the array value N so long as nc is less than the MAXLEN substitute, but if it is more than that, then it should increment the overflow count, right? I have a print statement at the end of the code that is supposed to print the overflow, but on inputs that have words greater than MAXLEN, the overflow count still prints a 0. Note: I previously had a simple else in the latter clause. Here is full code of the program. I need to understand the logic behind why this bug is happening.

#include <stdio.h>

/* Prints a histogram of the lengths of words. */

#define   MAXHIST  20
#define   MAXLEN  100
#define   MINLEN    0
#define   IN        1
#define   OUT       0

int main()
{
    int c, i, j, state;
    int wlen[MAXHIST];
    int nc, overflow;
    state = OUT;

    nc = overflow = 0;
    for (i = 0; i < MAXHIST; ++i)
        wlen[i] = 0;

    while ((c = getchar()) != EOF) {
        if (c == '\t' || c == '\n' || c == ' ') {
            state = OUT;
            if (nc < MAXLEN)
                ++wlen[nc];
            else if (nc > MAXLEN)
                ++overflow;
        }
        else if (state == OUT) {
            state = IN;
            nc = 1;
        }
        else {
            ++nc;
        }
    }

    for (i = 1; i < MAXHIST+1; ++i) {
        printf("%3d - %3d  ", i, wlen[i]);
            for (j = wlen[i]; j > MINLEN && j < MAXLEN; --j) {
                printf("*");
            }
        printf("\n");
    }
    printf("\nOverflow: %d\n", overflow);

    return 0;
}
1

There are 1 answers

0
derpirscher On

There are multiple problems with your code.

The most evident one is, that if MAXHIST < MAXLENGTH it won't count words with a length between MAXHIST and MAXLENGTH. In the worst case, you could get a segmentation fault, because you try to access memory you didn't allocate. Get rid of one of those, and only use a single value to determine the size of the histogram-array. The size of this array will also determine the maximum length of words you can count (or vice versa)

Furthermore, because of you condition while counting, it won't count words with a length of exactly MAXLENGTH, because then nc is neither < MAXLENGTH nor > MAXLENGTH.

And finally, your way of resetting the length is needlessly complicated. You don't need the state, just reset the nc = 0 on a whitespace instead of switching state.

And as a matter of style: I personally prefer to make the scope of variable as narrow as possible. Especially for loopcounters (like your i and j) I prefer to define them within the loop's scope, ie in the loop-declaration.

See the following code which fixes those issues.

#include <stdio.h>

/* Prints a histogram of the lengths of words. */

//the width of the histogram is defined by the maximum length to be counted
//ie MAXHIST will be MAXLEN + 1.
//the length of words to be counted
#define   MAXLEN  20
#define   MINLEN   0
int main()
{
    int c;
    int wlen[MAXLEN + 1];
    int nc = 0, overflow = 0;

    for (int i = 0; i <= MAXLEN; ++i)
        wlen[i] = 0;

    while ((c = getchar()) != EOF) {
        if (c == '\t' || c == '\n' || c == ' ') {
            //the current char is a whitespace, we must count the word

            //if the wordlength is between MINLEN and MAXLEN
            //increase the respective value
            //otherwise increase the overflow
            if (nc >= MINLEN && nc <= MAXLEN)
                ++wlen[nc];
            else
                ++overflow;

            //reset the word length
            nc = 0;
        }
        else {
            //the current char is not a whitespace, increase the length
            //of the current word
            ++nc;
        }
    }

    // I changed this to start from `0` to print also the count of empty words
    // ie two consecutive whitespaces
    for (int i = 0; i <= MAXLEN; ++i) {
        printf("%3d - %3d  ", i, wlen[i]);
            for (int j = 0; j < wlen[i]; ++j) {
                printf("*");
            }
        printf("\n");
    }
    printf("\nOverflow: %d\n", overflow);

    return 0;
}