Diffuse Gedanken eines Dreisterneprogrammierers.

At least the ext[2-4] filesystems support files with holes (sparse files). There are several ways to create them; one is the seek argument of the dd command, which skips the given number of output blocks before writing:

$ dd if=/dev/urandom of=test bs=4096 count=10 seek=10
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.0126733 s, 3.2 MB/s
$ wc -c test
81920 test
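
The same kind of file can also be produced from C by seeking past the end of an empty file before writing. The following is only a minimal sketch of that idea: make_holey_file is a made-up helper name, and unlike the dd call it writes a fixed byte instead of random data.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with a hole of `hole` bytes followed by
 * `data` bytes of payload. Returns 0 on success, -1 on error. */
int make_holey_file(const char *path, off_t hole, size_t data) {
  char buf[4096];
  int fd;

  assert(path != NULL && hole >= 0);
  fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return -1;

  /* Seeking past the end of the file writes nothing. The skipped range
   * becomes a hole as soon as data is written behind it. */
  if (lseek(fd, hole, SEEK_SET) == (off_t)-1) {
    close(fd);
    return -1;
  }

  memset(buf, 'x', sizeof buf);
  while (data > 0) {
    size_t chunk = data < sizeof buf ? data : sizeof buf;
    ssize_t n = write(fd, buf, chunk);
    if (n <= 0) {
      close(fd);
      return -1;
    }
    data -= (size_t)n;
  }
  return close(fd);
}
```

Calling make_holey_file("test", 40960, 40960) should give a file of 81920 bytes, like the dd example. On a filesystem that supports holes, st_blocks from stat() should then account for only about half of that size.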


So I was interested in whether it is possible to actually find these holes. ZFS and XFS have their own APIs for that.
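
As an aside, Linux kernels from 3.1 on let lseek() take SEEK_HOLE and SEEK_DATA, which gives a filesystem-independent way to ask the same question. The sketch below is not the ZFS or XFS interface itself, and print_holes is a made-up helper name. It assumes a Linux system with glibc; on filesystems that do not track holes, the kernel reports the whole file as data, so the sketch finds nothing there.

```c
#define _GNU_SOURCE /* makes glibc expose SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA /* Linux values, in case the headers did not define them */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Hypothetical helper: print every hole that starts before the end of the
 * file and return the number of holes found, or -1 on error. */
int print_holes(const char *path) {
  struct stat st;
  off_t pos = 0;
  int holes = 0;
  int fd;

  assert(path != NULL);
  fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;
  if (fstat(fd, &st) < 0) {
    close(fd);
    return -1;
  }

  while (pos < st.st_size) {
    off_t hole, data;

    /* Start of the next hole at or after pos; the end of file counts too. */
    hole = lseek(fd, pos, SEEK_HOLE);
    if (hole == (off_t)-1) {
      holes = -1;
      break;
    }
    if (hole >= st.st_size)
      break;

    /* First data byte behind the hole; ENXIO means the hole runs to EOF. */
    data = lseek(fd, hole, SEEK_DATA);
    if (data == (off_t)-1) {
      if (errno != ENXIO) {
        holes = -1;
        break;
      }
      data = st.st_size;
    }
    printf("hole: bytes %lld..%lld\n", (long long)hole, (long long)data - 1);
    holes++;
    pos = data;
  }
  close(fd);
  return holes;
}
```

Unlike the block-based tools discussed next, this works on byte offsets and only tells you where the holes are, not where the data lives on disk.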

For ext*, there is also a way to find holes. It is based on the FIBMAP ioctl (and similar ones), for which you need the CAP_SYS_RAWIO capability (that is, you usually have to be root). If you only want to look at the holes in a file, you can use hdparm, for example (as shown here):

$ sudo hdparm --fibmap test

test:
 filesystem blocksize 4096, begins at LBA 0; assuming 512 byte sectors.
 byte_offset  begin_LBA    end_LBA    sectors
       40960    8928120    8928199         80


Another possibility is the filefrag utility, which is part of the e2fsprogs package in Debian squeeze (as shown here):

$ sudo filefrag -v test
Filesystem type is: ef53
File size of test is 81920 (20 blocks, blocksize 4096)
 ext logical physical expected length flags
   0      10  1116015              10 eof
test: 2 extents found
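
As far as I know, newer versions of filefrag get this extent list from the FIEMAP ioctl. Here is a minimal sketch of calling it directly, assuming a Linux system with the kernel headers installed. print_extents and MAX_EXTENTS are made-up names for this example, and the fixed limit means that very fragmented files are only listed partially.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fiemap.h>
#include <linux/fs.h>

#define MAX_EXTENTS 32 /* arbitrary limit for this sketch */

/* Hypothetical helper: print the extents of `path` (all values in bytes).
 * Returns the number of extents printed, or -1 on error. */
int print_extents(const char *path) {
  struct fiemap *fm;
  unsigned int i;
  int fd, count;

  assert(path != NULL);
  fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;

  /* The extent array lives directly behind the fiemap header. */
  fm = calloc(1, sizeof *fm + MAX_EXTENTS * sizeof(struct fiemap_extent));
  if (fm == NULL) {
    close(fd);
    return -1;
  }
  fm->fm_start = 0;
  fm->fm_length = FIEMAP_MAX_OFFSET; /* map the whole file */
  fm->fm_flags = FIEMAP_FLAG_SYNC;   /* flush dirty data before mapping */
  fm->fm_extent_count = MAX_EXTENTS;

  if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
    free(fm);
    close(fd);
    return -1;
  }

  /* Gaps between the logical ranges printed here are the holes. */
  for (i = 0; i < fm->fm_mapped_extents; i++) {
    const struct fiemap_extent *e = &fm->fm_extents[i];
    printf("logical %llu physical %llu length %llu%s\n",
           (unsigned long long)e->fe_logical,
           (unsigned long long)e->fe_physical,
           (unsigned long long)e->fe_length,
           (e->fe_flags & FIEMAP_EXTENT_LAST) ? " (last)" : "");
  }
  count = (int)fm->fm_mapped_extents;
  free(fm);
  close(fd);
  return count;
}
```

Not every filesystem implements FIEMAP, so a return value of -1 may simply mean that the ioctl is not supported there.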


If you really want to use your own program, here is a nice piece of example code I found, on which I based my own:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <linux/fs.h>


int main (int argc, char* argv[]) {
  int fd, blocknum, blocksize;
  struct stat fileinfo;

  /* argv[0] is the program name, so a file name requires argc >= 2. */
  if (argc < 2) {
    fprintf(stderr, "Syntax error: expected a file name\n");
    exit(EXIT_FAILURE);
  }

  if ((fd = open(argv[1], O_RDONLY)) < 0) {
    int errnum = errno;
    fprintf(stderr, "Cannot open '%s': %s\n", argv[1], strerror(errnum));
    exit(EXIT_FAILURE);
  }

  if (ioctl(fd, FIGETBSZ, &blocksize) < 0 ) {
    int errnum = errno;
    fprintf(stderr, "Cannot get blocksize: %s\n", strerror(errnum));
    exit(EXIT_FAILURE);
  }

  if (fstat(fd, &fileinfo) < 0) {
    int errnum = errno;
    fprintf(stderr, "Stat failed: %s\n", strerror(errnum));
    exit(EXIT_FAILURE);
  }

  blocknum = (fileinfo.st_size + blocksize - 1) / blocksize;

  printf("Filename: %s\nBlocksize: %d\nBlocknum: %d\n",
         argv[1], blocksize, blocknum);
 
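  int i;
  for (i = 0; i < blocknum; i++) {
    /* FIBMAP takes the logical block index and overwrites it with the
       physical block number; 0 means the block is not mapped (a hole). */
    int block = i;
    if (ioctl(fd, FIBMAP, &block) < 0) {
      fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
      close(fd);
      exit(EXIT_FAILURE);
    }
    printf("%10d\t", block);
  }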
  close(fd);
  printf("\n");
  exit(EXIT_SUCCESS);
}


The output:

$ sudo ./fibmap test
Filename: test
Blocksize: 4096
Blocknum: 20
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
   1116015
   1116016
   1116017
   1116018
   1116019
   1116020
   1116021
   1116022
   1116023
   1116024


And that is indeed a list of ten zeros (FIBMAP reports block 0 for an unmapped block, that is, a hole) followed by ten consecutive blocks.
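
Since a block number of 0 in the FIBMAP output marks a hole, the list can be condensed into ranges. A minimal sketch of that step, assuming the block numbers have already been collected into an array; print_runs is a made-up helper name and not part of the program above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: group a FIBMAP result list into runs of holes
 * (block number 0) and data. Returns the number of hole runs. */
int print_runs(const int *blocks, int n) {
  int holes = 0, start = 0, i;

  assert(blocks != NULL || n == 0);
  for (i = 1; i <= n; i++) {
    /* A run ends at the end of the list or where hole/data status flips. */
    if (i == n || (blocks[i] == 0) != (blocks[start] == 0)) {
      int is_hole = (blocks[start] == 0);
      printf("%s: logical blocks %d..%d\n",
             is_hole ? "hole" : "data", start, i - 1);
      holes += is_hole;
      start = i;
    }
  }
  return holes;
}
```

For the twenty block numbers above, this prints "hole: logical blocks 0..9" and "data: logical blocks 10..19" and returns 1.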