SEEK_HOLE always points to the end of file


I'm working on an APUE exercise (Chapter 4, Problem 4.6): write a program like cp that copies files. If the file contains holes (i.e. it is a sparse file), the '\0's in the gaps should never be copied. The ideal approach is to read and write block by block, with the block size determined by lseek(fd, current_off, SEEK_HOLE). I took /bin/ls as an example. But every time I lseek this file (or other files), the file offset is always set to the end of the file. I've checked this post, but there seem to be no satisfactory answers. Here is my code:

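#define _GNU_SOURCE /* exposes SEEK_HOLE with glibc */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd;
    off_t off;
    fd = open("/bin/ls", O_RDONLY);
    if (fd == -1)
        exit(EXIT_FAILURE);
    /* ask for the first hole at or after offset 0 */
    if ((off = lseek(fd, 0, SEEK_HOLE)) == -1)
        exit(EXIT_FAILURE);
    /* off_t is not an int, so cast it for printing */
    printf("%lld\n", (long long)off);
    return 0;
}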

My kernel is Linux 3.13.0-rc3, pulled from the latest stable tree, and my filesystem is ext4. If lseek with SEEK_HOLE is unavailable, would it be proper to regard any '\0' as the beginning of a hole? Thanks for your answers.

There is 1 answer

Kristof Provost On BEST ANSWER

From 'man lseek' (man pages are your friend. First place to look for information.)

       SEEK_HOLE
          Adjust the file offset to the next hole in the file greater than
          or equal to offset.  If offset points into the middle of a hole,
          then the file offset is set to offset.  If there is no hole past
          offset,  then the file offset is adjusted to the end of the file
          (i.e., there is an implicit hole at the end of any file).

In other words, you're seeing entirely expected behavior. There's no hole in ls, so you're getting a hole at the end of the file.
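To make that concrete, here is a minimal sketch of how one might tell a real hole from the implicit one at end of file: compare the offset returned by SEEK_HOLE with the file size. The helper name file_has_hole is made up for illustration, and the sketch assumes Linux with glibc, where _GNU_SOURCE is needed to expose SEEK_HOLE.

```c
#define _GNU_SOURCE /* assumption: glibc, which hides SEEK_HOLE otherwise */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library):
 * returns 1 if the file has a hole before end of file,
 *         0 if the only "hole" is the implicit one at end of file,
 *        -1 on error.
 */
int file_has_hole(const char *path)
{
    struct stat st;
    off_t hole;
    int fd, rc = -1;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* A first hole that sits exactly at st_size is just the implicit one. */
    if (fstat(fd, &st) == 0 && (hole = lseek(fd, 0, SEEK_HOLE)) != -1)
        rc = hole < st.st_size;

    close(fd);
    return rc;
}
```

For a file like /bin/ls, which according to the question reports its first hole at end of file, this would be expected to return 0. On a filesystem that does not track holes, it should return 0 for every non-empty file.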

You can create a sparse file for testing with dd:

dd if=/dev/zero of=sparsefile bs=1 count=1 seek=40G

As for your final question: No, that's not reasonable. Ordinary files quite often contain bytes with the value 0 as real data (a binary like ls certainly does). A '\0' byte does not indicate that the file is sparse.
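For the copying exercise itself, one possible approach is to alternate SEEK_DATA and SEEK_HOLE and copy only the data extents, so that zero bytes that are real data are copied and holes are skipped. The following is a sketch under assumptions, not a definitive implementation: copy_sparse and copy_range are made-up names, it assumes Linux with glibc, and it assumes the source file is not modified during the copy.

```c
#define _GNU_SOURCE /* assumption: glibc; exposes SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy bytes [start, end) from in to out, keeping the same offsets. */
static int copy_range(int in, int out, off_t start, off_t end)
{
    char buf[65536];

    while (start < end) {
        size_t want = (size_t)(end - start) < sizeof buf
                          ? (size_t)(end - start) : sizeof buf;
        ssize_t n = pread(in, buf, want, start);
        if (n <= 0)
            return -1; /* read error, or the file shrank under us */
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = pwrite(out, buf + done, (size_t)(n - done),
                               start + done);
            if (w == -1)
                return -1;
            done += w;
        }
        start += n;
    }
    return 0;
}

/* Hypothetical helper: copy src to dst, skipping holes. 0 on success, -1 on error. */
int copy_sparse(const char *src, const char *dst)
{
    struct stat st;
    off_t pos = 0, data, hole;
    int in, out, rc = -1;

    assert(src != NULL && dst != NULL);
    in = open(src, O_RDONLY);
    if (in == -1)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) {
        close(in);
        return -1;
    }
    if (fstat(in, &st) == -1)
        goto done;

    while (pos < st.st_size) {
        data = lseek(in, pos, SEEK_DATA); /* start of the next data extent */
        if (data == -1) {
            if (errno == ENXIO)
                break; /* nothing but a trailing hole is left */
            goto done;
        }
        hole = lseek(in, data, SEEK_HOLE); /* end of that data extent */
        if (hole == -1)
            goto done;
        if (copy_range(in, out, data, hole) == -1)
            goto done;
        pos = hole;
    }
    /* Set the final size so that a trailing hole is preserved. */
    if (ftruncate(out, st.st_size) == 0)
        rc = 0;
done:
    close(in);
    close(out);
    return rc;
}
```

On a filesystem without hole support, the whole file should be reported as a single data extent, so this degrades to a plain copy. Whether the destination actually ends up sparse depends on the destination filesystem.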