Statistics are to the politician what the lamppost is to the drunkard:
they serve less for illumination than for propping up one's own position.
(Roland Koch)

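At least the ext[2-4] filesystems support files with holes (sparse files). Such files can be created in several ways; one possibility is the seek argument of the dd command. Here, seek=10 skips the first ten 4096-byte blocks of the output file before ten blocks of data are written: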

$ dd if=/dev/urandom of=test bs=4096 count=10 seek=10
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.0126733 s, 3.2 MB/s
$ wc -c test
81920 test
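
The same kind of file can also be produced from C by seeking past the end of the file before writing. The following is only a minimal sketch, assuming Linux, where st_blocks counts 512-byte units; make_sparse and size_and_allocation are hypothetical helper names, not part of any library:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with a hole of `hole` bytes followed
 * by `data` bytes of payload. Returns 0 on success, -1 on error. */
int make_sparse(const char *path, off_t hole, size_t data) {
  char buf[4096];
  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return -1;

  /* Seeking past the end allocates nothing; the gap only becomes part of
   * the file once something is written behind it. */
  if (lseek(fd, hole, SEEK_SET) == (off_t)-1) {
    close(fd);
    return -1;
  }

  memset(buf, 'x', sizeof(buf));
  while (data > 0) {
    size_t chunk = data < sizeof(buf) ? data : sizeof(buf);
    ssize_t written = write(fd, buf, chunk);
    if (written <= 0) {
      close(fd);
      return -1;
    }
    data -= (size_t)written;
  }
  return close(fd);
}

/* Hypothetical helper: report the apparent size and the allocated bytes.
 * Assumption: st_blocks is counted in 512-byte units, as on Linux. */
int size_and_allocation(const char *path, off_t *size, off_t *allocated) {
  struct stat st;
  if (stat(path, &st) < 0)
    return -1;
  *size = st.st_size;
  *allocated = (off_t)st.st_blocks * 512;
  return 0;
}
```

Calling make_sparse("test", 40960, 40960) should give a file shaped like the one from the dd call above. On a filesystem that supports holes, the allocated byte count reported by size_and_allocation is expected to be clearly smaller than the size.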


So I was interested in whether it is possible to actually find these holes. ZFS and XFS have their own APIs for that.

For ext*, there is also a way to find holes. It is based on the FIBMAP ioctl (and similar ones), for which you need the CAP_SYS_RAWIO capability (that is, usually you have to be root). If you only want to look at the holes in a file, you can use, for example, hdparm:

$ sudo hdparm --fibmap test

test:
 filesystem blocksize 4096, begins at LBA 0; assuming 512 byte sectors.
 byte_offset  begin_LBA    end_LBA    sectors
       40960    8928120    8928199         80


Another possibility is the filefrag utility, which is part of the e2fsprogs package in Debian squeeze:

$ sudo filefrag -v test
Filesystem type is: ef53
File size of test is 81920 (20 blocks, blocksize 4096)
 ext logical physical expected length flags
   0      10  1116015              10 eof
test: 2 extents found
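
An extent list like this can also be requested directly from the kernel through the FIEMAP ioctl, one of the "similar" interfaces to FIBMAP. The following is a rough sketch only, assuming a Linux kernel and filesystem with FIEMAP support; print_extents and MAX_EXTENTS are hypothetical names chosen for illustration. Note that FIEMAP reports byte offsets, not block numbers:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

#define MAX_EXTENTS 32 /* arbitrary limit for this sketch */

/* Hypothetical helper: print up to MAX_EXTENTS extents of `path`.
 * Returns the number of extents printed, or -1 on error (including
 * filesystems that do not implement FIEMAP). */
int print_extents(const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;

  /* struct fiemap is followed by the array the kernel fills in. */
  size_t size = sizeof(struct fiemap) +
                MAX_EXTENTS * sizeof(struct fiemap_extent);
  struct fiemap *fm = calloc(1, size);
  if (fm == NULL) {
    close(fd);
    return -1;
  }

  fm->fm_start = 0;
  fm->fm_length = FIEMAP_MAX_OFFSET; /* map the whole file */
  fm->fm_flags = FIEMAP_FLAG_SYNC;   /* flush dirty data first */
  fm->fm_extent_count = MAX_EXTENTS;

  if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
    free(fm);
    close(fd);
    return -1;
  }

  int n = (int)fm->fm_mapped_extents;
  for (int i = 0; i < n; i++) {
    const struct fiemap_extent *e = &fm->fm_extents[i];
    /* Holes do not show up as extents; they are the gaps between the
     * logical ranges printed here. */
    printf("logical %llu physical %llu length %llu\n",
           (unsigned long long)e->fe_logical,
           (unsigned long long)e->fe_physical,
           (unsigned long long)e->fe_length);
  }

  free(fm);
  close(fd);
  return n;
}
```

For the test file above, one would expect a single extent starting at logical byte offset 40960, but the exact output depends on the filesystem.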


Now, if you really want to use your own program, here is a nice piece of example code I found, on which I based my own code:


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <linux/fs.h>


int main (int argc, char* argv[]) {
  int fd, blocknum, blocksize;
  struct stat fileinfo;

  /* argv[0] is the program name, so a file argument requires argc >= 2 */
  if (argc < 2) {
    fprintf(stderr, "Syntax error: expected a file name\n");
    exit(EXIT_FAILURE);
  }

  if ((fd = open(argv[1], O_RDONLY)) < 0) {
    int errnum = errno;
    fprintf(stderr, "Cannot open '%s': %s\n", argv[1], strerror(errnum));
    exit(EXIT_FAILURE);
  }

  if (ioctl(fd, FIGETBSZ, &blocksize) < 0 ) {
    int errnum = errno;
    fprintf(stderr, "Cannot get blocksize: %s\n", strerror(errnum));
    exit(EXIT_FAILURE);
  }

  if (fstat(fd, &fileinfo) < 0) {
    int errnum = errno;
    fprintf(stderr, "Stat failed: %s\n", strerror(errnum));
    exit(EXIT_FAILURE);
  }

  blocknum = (fileinfo.st_size + blocksize - 1) / blocksize;

  printf("Filename: %s\nBlocksize: %d\nBlocknum: %d\n",
         argv[1], blocksize, blocknum);
 
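  /* FIBMAP takes a logical block index and overwrites it with the
     physical block number; 0 means the block is not mapped (a hole). */
  int i;
  for (i = 0; i < blocknum; i++) {
    int block = i;
    if (ioctl(fd, FIBMAP, &block) < 0) {
      fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
      close(fd);
      exit(EXIT_FAILURE);
    }
    printf("%10d\n", block);
  }
  close(fd);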
  exit(EXIT_SUCCESS);
}


The output:

$ sudo ./fibmap test
Filename: test
Blocksize: 4096
Blocknum: 20
         0
         0
         0
         0
         0
         0
         0
         0
         0
         0
   1116015
   1116016
   1116017
   1116018
   1116019
   1116020
   1116021
   1116022
   1116023
   1116024


And that is indeed a list of ten zeros (FIBMAP reports block number 0 for a logical block that is not mapped, that is, for the hole) followed by ten consecutive blocks.
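
Such a list becomes easier to read when it is collapsed into runs. The following is a small sketch, assuming the block numbers have already been collected into an array as returned by FIBMAP; print_runs is a hypothetical helper, not part of the program above:

```c
#include <stdio.h>

/* Hypothetical helper: print the block map as runs. A run is either a
 * hole (physical block 0) or a sequence of physically consecutive
 * blocks. Returns the number of runs found. */
int print_runs(const int *blocks, int n) {
  int runs = 0;
  int start = 0;

  for (int i = 1; i <= n; i++) {
    int hole = (blocks[start] == 0);
    /* The run continues while we stay inside the hole, or while each
     * physical block directly follows its predecessor. */
    int continues = i < n && (hole ? blocks[i] == 0
                                   : blocks[i] == blocks[i - 1] + 1);
    if (continues)
      continue;

    if (hole)
      printf("logical %d-%d: hole\n", start, i - 1);
    else
      printf("logical %d-%d: physical %d-%d\n",
             start, i - 1, blocks[start], blocks[i - 1]);
    runs++;
    start = i;
  }
  return runs;
}
```

For the twenty values shown above, this should report two runs: a hole covering logical blocks 0-9 and one contiguous range covering logical blocks 10-19.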