isspace not working correctly?


It is probably my code that's not working, but any white-space characters (\n, \t, \r, etc.) are not being converted to a space " ". As far as I can see, it looks like it should work, but it seg faults each time it hits a new line.

Edit: Sorry, it does change white-space characters to ' ', but it stops once a new line is hit. The program runs up to that new-line spot, where it seg faults.

It also will not replace any of the white space. The code reads from a .txt file, so if you want to run it, make a text file named alice.txt (or change the file name in the code) and include white-space characters in the file.

Can you please help me? I've been trying to solve this for hours to no avail. What am I doing wrong? Thanks!

#include <stdio.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define LEN 4096

void upper(char *tok, FILE *out);
void rstrip(char *tok, FILE *out);

int main ()
{
    char *tok;  //tokenizer
    char buf[LEN];
    FILE *in = fopen("alice.txt", "r");
    FILE *out = fopen("out.txt", "w");
    int len = 0;

    while (fgets(buf, LEN, in)) {
        /* cleans all line breaks, tabs, etc, into space*/
        while (buf[len]) {
            printf("%c", buf[len]); //Error checking, prints each char of buf
            if (isspace(buf[len]))  //isspace not working properly? not changing \t, \r, etc to ' ' */
                buf[len] = ' ';     //not replacing
            if (buf[len] < 0)   //added cuz negative character values were being found in text file.
                buf[len] = ' '; 
            len++;
        }

        /*parses by words*/
        tok = strtok(buf, " ");
        rstrip(tok, out);
        while (tok != NULL) {
            tok = strtok(NULL, " ");
            rstrip(tok, out);
        }
    }

    fclose(in);
    fclose(out);
    return 0; 
}

/*makes appropiate words uppercase*/
void upper(char *tok, FILE *out)
{
    int cur = strlen(tok) - 1; //current place

    while (cur >= 0) {
        tok[cur] = toupper(tok[cur]);
        printf("%s\n", tok); 
        fprintf(out, "%s", tok);
        cur--;
    }

}

/*checks for 'z' in tok (the word)*/
void rstrip(char *tok, FILE *out)
{
    int cur = strlen(tok) - 1; //current place

    printf("%s", tok);
    while (cur >= 0) {
        if (tok[cur] == 'z')
            upper(tok, out);
        cur--;
    }
}

There are 2 answers

David C. Rankin

In addition to validating that your FILE * streams are open, you also need to validate the values passed to your functions. When strtok finishes tokenizing, it returns NULL. You do not want to pass NULL to rstrip, because strlen(NULL) is undefined behavior and is the likely source of the seg fault. For example:

void rstrip(char *tok, FILE *out)
{
    if (!tok) return;          // validate tok

    int cur = strlen(tok) - 1; //current place
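
The same problem can also be avoided at the call site. The following is a minimal sketch, not the original program: `count_tokens` is a hypothetical helper used only to show a loop shape in which the NULL returned by strtok ends the loop and never reaches the body.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the original code): splits buf on spaces
   in place and returns how many tokens were found. */
static int count_tokens(char *buf)
{
    int n = 0;

    /* The loop condition tests tok before the body runs, so the NULL that
       strtok returns at the end of the line is never used as a string. */
    for (char *tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        /* tok is non-NULL here; a call such as rstrip(tok, out) would be safe */
        n++;
    }
    return n;
}
```

In the question's main, the body of this loop is where the call to rstrip would go, replacing the separate call before the while loop and the unchecked call inside it.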

You need to do the same for upper. Simply fixing those issues goes a long way. After incorporating the validations and the changes suggested by J.L.:

input:

$ cat alice.txt
#include <stdio.h>
int func()
{
z z z z z
}
int main(void)
{
    printf("%d\n",func());
    return 0;
}

output:

$ ./bin/ctypehelp
#include <stdio.h>
#include<stdio.h>int func()
intfunc(){
{ z z z z z
zZ
zZ
zZ
zZ
zZ
}
}int main(void)
intmain(void){
{    printf("%d\n",func());
printf("%d\n",func());    return 0;
return0;}

out.txt:

$ cat out.txt
ZZZZZ

Work on these improvements and post back when you are stuck again.
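
The stream validation mentioned at the start of this answer is not shown above. A minimal sketch, assuming a hypothetical helper named `open_or_report` (it is not part of the original program), might look like this:

```c
#include <stdio.h>

/* Hypothetical helper: opens a file and, if fopen fails, prints the
   reason to stderr and returns NULL so the caller can bail out. */
static FILE *open_or_report(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);

    if (fp == NULL)
        perror(path);   /* e.g. "alice.txt: No such file or directory" */
    return fp;
}
```

In main, both `in` and `out` would be checked for NULL before the fgets loop starts, returning a non-zero status if either open failed.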

Jonathan Leffler

You set len = 0; in the wrong place.

You need:

while (fgets(buf, LEN, in) != 0)
{
    for (int len = 0; buf[len] != '\0'; len++)
    {
        printf("%c", buf[len]);
        if (isspace((unsigned char)buf[len]))
            buf[len] = ' ';
        if (buf[len] < 0)
            buf[len] = ' ';
    }
    …rest of loop…
}

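This ensures you set len to 0 for each line that is read. In the original code len keeps its value from the previous line, so from the second line onward the scan starts past the beginning of buf and can run beyond the end of the string. You also need to ensure that the argument to isspace() is valid: it is an int whose value must either be EOF or be representable as an unsigned char. If plain char is signed on your platform, passing a negative char value without the cast is undefined behavior.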

The C standard says (referring to the arguments of the is*() functions in <ctype.h>):

In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF.
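
To make the cast concrete, here is a minimal sketch of the cleaning step as a stand-alone function. The name `normalize_spaces` is hypothetical, and the `c > 127` test is an assumption standing in for the question's `buf[len] < 0` check, which only fires where plain char is signed.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: replaces, in place, every white-space character
   and every byte above 127 with a plain space. */
static void normalize_spaces(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Convert once to unsigned char so the value handed to isspace()
           is always in the range the standard allows. */
        unsigned char c = (unsigned char)s[i];

        if (isspace(c) || c > 127)
            s[i] = ' ';
    }
}
```

Because the index is declared inside the function, it starts at 0 on every call, which also removes the stale-len problem described above.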