C - Function read(file,buffer,bytes to read) breaking a string


I'm trying to read a file of 1024 lines, each containing the same letter repeated 9 times, and return an error as soon as a line doesn't match this pattern.

The file looks like this, but with 1024 lines:

eeeeeeeee
eeeeeeeee
eeeeeeeee

Code:

fd = open(fileName, O_RDONLY);
lseek(fd,0,SEEK_SET);


if(flock(fd, LOCK_SH) == -1)
        perror("error on file lock");

if(fd != 0){

    read(fd, lineFromFile, (sizeof(char)*10));
    arguments->charRead = lineFromFile[0];

    for(i=0; i < 1024; i++){        
        var = read(fd, toReadFromFile, (sizeof(char)*10));  
        if(strncmp(toReadFromFile,lineFromFile,10) != 0 || var < 10){           

            arguments->result = -1;
            printf("%s \n\n",toReadFromFile);
            printf("%s \n",lineFromFile);
            printf("i %d var %d  \n",i,var);                
            free(toReadFromFile);
            free(lineFromFile);
            return ;
        }                       
    }
}

Output:

eeeee
eeee 

eeeee
eeee 
i 954 var 6 

I have 5 different files, each with a different letter, and every one of them produces this output at that same iteration (954), even though the line is correct: the letter written 9 times with a \n at the end.

Any ideas why this could be happening? If I don't use lseek it works fine, but I need lseek to divide the file into several parts to be tested by different threads. I used offset 0 in the lseek call here to simplify the example.

Thanks.


There are 2 answers

3
JS1 On BEST ANSWER

It looks like you are looking for "eeeee\neeee" instead of "eeeeeeeee\n", which means your file would have to start like this:

eeeee
eeeeeeeee
eeeeeeeee

and end like this:

eeeeeeeee
eeee

If your file ends like this:

eeeeeeeee
eeeeeeeee

Then when you get to the last line, it will fail because you will only read "eeeee\n" instead of "eeeee\neeee".

Given the new information in your comment, I believe the problem is that you should not be seeking to the middle of a line (in this case offsets 342 and 684). You should seek to an exact multiple of the expected string length, 10 bytes (like 340 and 680). Also, line 954 is not where the problem happened. It is line 954 + X, where X is the line your seek landed on.
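To illustrate the alignment point, here is a minimal sketch of one way to compute per-thread starting offsets that always fall on a line boundary. It assumes every line is exactly 10 bytes (9 letters plus '\n'); the names LINE_LEN and part_start are hypothetical and do not come from the question's code. The idea is to split the file by line count first and only then convert to bytes.

```c
#include <assert.h>
#include <sys/types.h>

/* Assumption: every line is exactly LINE_LEN bytes (9 letters + '\n'). */
#define LINE_LEN 10

/* Hypothetical helper: byte offset at which part `part` (0-based) of
 * `nparts` should start, for a file holding `total_lines` lines.
 * The result is always a multiple of LINE_LEN, so no part starts mid-line.
 * Passing part == nparts yields the end-of-file offset. */
off_t part_start(long total_lines, int nparts, int part)
{
    assert(nparts > 0 && part >= 0 && part <= nparts);
    long first_line = total_lines * part / nparts;  /* split by lines, not bytes */
    return (off_t)first_line * LINE_LEN;
}
```

A thread handling part k would then cover the bytes from part_start(total, n, k) up to, but not including, part_start(total, n, k + 1), and the line number reported on a mismatch would be relative to that starting line.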

2
John Bollinger On

Whatever other problems your program may have, it certainly has this: the read() function is not guaranteed to read the full number of bytes requested. It will read at least one unless it encounters an error or the end of the file, and under many circumstances it does read the full number of bytes requested, but even when there are enough bytes remaining before the end of the file, read() may read fewer bytes than requested.

The comments urging you to use a higher-level function instead are well considered, but if you are for some reason obligated to use read() then you must watch for cases where fewer bytes are read than requested, and handle them by reading additional bytes into the unused tail end of the buffer. Possibly multiple times.

In function form, that might look like this:

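#include <unistd.h>   /* read() */

/* Read exactly num_to_read bytes unless end of file or an error intervenes.
 * Returns the number of bytes read, or a negative value on error. */
int read_all(int fd, char buf[], int num_to_read) {
    int total_read = 0;
    int n_read = 0;

    while (total_read < num_to_read) {
        /* Append to the unused tail of the buffer. */
        n_read = read(fd, buf + total_read, num_to_read - total_read);
        if (n_read > 0) {
            total_read += n_read;
        } else {
            break;  /* 0 means end of file, negative means error */
        }
    }

    return (n_read < 0) ? n_read : total_read;
}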
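As a rough sketch of how such a full-read loop could be applied to the original task, the block below checks a run of lines starting at a given byte offset. It is an illustration under assumptions, not the asker's code: LINE_LEN, read_fully and check_lines are hypothetical names, each line is assumed to be exactly 10 bytes, and read_fully is a variant of the function above that uses ssize_t and retries when read() is interrupted (EINTR). It compares with memcmp rather than printing the buffers with %s, since the buffers are not null-terminated.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define LINE_LEN 10  /* assumption: 9 letters plus '\n' */

/* Keep calling read() until `len` bytes arrive, EOF is hit, or an error
 * occurs. Returns the number of bytes read, or -1 on error. */
ssize_t read_fully(int fd, char *buf, size_t len)
{
    size_t total = 0;
    while (total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n > 0)
            total += (size_t)n;
        else if (n == 0)
            break;                 /* end of file */
        else if (errno != EINTR)
            return -1;             /* real error; EINTR is simply retried */
    }
    return (ssize_t)total;
}

/* Hypothetical checker: returns 0 if the `nlines` lines starting at byte
 * offset `start` are identical, each being one letter repeated 9 times
 * followed by '\n'. Returns -1 on any mismatch, short read, or error. */
int check_lines(int fd, off_t start, long nlines)
{
    char first[LINE_LEN], line[LINE_LEN];

    if (nlines <= 0)
        return 0;                  /* nothing to check */
    if (lseek(fd, start, SEEK_SET) == (off_t)-1)
        return -1;

    /* The first line of the range is the reference; validate its shape. */
    if (read_fully(fd, first, LINE_LEN) != LINE_LEN || first[LINE_LEN - 1] != '\n')
        return -1;
    for (int k = 1; k < LINE_LEN - 1; k++)
        if (first[k] != first[0])
            return -1;

    /* One line is already consumed, so only nlines - 1 remain. */
    for (long i = 1; i < nlines; i++) {
        if (read_fully(fd, line, LINE_LEN) != LINE_LEN)
            return -1;
        if (memcmp(line, first, LINE_LEN) != 0)
            return -1;
    }
    return 0;
}
```

Note that lseek followed by read moves the shared file position, so this sketch assumes each thread works on its own file descriptor obtained from a separate open() call.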