Via The Old New Thing, I found this video about the "fault tolerant heap" in Windows 7. Microsoft analyzed a large number of program crashes and found that a relevant share of them were caused by heap corruption. As a stopgap for the most common of these, they created a mode for running applications which uses various techniques to mitigate them.

On the one hand, this sounds like a very bad hack which - from my purist point of view - should not be used at all. On the other hand, it is a technical accomplishment. So the criticism here is not directed at the engineers at Microsoft: they probably have to face stupid programmers and stupid users every day, and are just trying to find ways of handling them more efficiently.

However, I consider this approach dangerous, in the sense that programmers may tend to just turn on the fault tolerant heap (or let their setup program do it) rather than programming cleanly. I can imagine this being a lot easier in many cases - and time is money. And then, in ten years, when a lot of new stupid "standard" software relies on it, this will no longer be a feature, but a necessity for that software to run at all.

I cannot really understand why this kind of bug is still an issue anyway. There are good, fast and stable implementations of automatic memory management out there, especially garbage collectors, which should be sufficient for most software (which is bloated anyway, so nobody will notice the difference) - even for macro assemblers like C++, if you desperately want to stay with them.

I mean, there are plenty of remaining bugs which cannot be prevented that way, but this kind of bug is a solved problem. And programmers could concentrate on other stuff, and would probably be more productive when not having to think about memory management.

Before I start, I clearly have to say that almost all websites suck! Especially official websites, like those of universities or government agencies: on such websites, you mostly find everything except what you are actually searching for.


Source: xkcd


So why is the title of this post about British websites, then? Well, I recently had to deal with some of them, for applications for vacancies. They all require cookies and JavaScript - cookies are somewhat understandable, but JavaScript is not! And for the cookies, simple session cookies should be sufficient, and they should not be public. Some of the sites even include stuff from Google - which I really do not want (usually) on a university website.

But the worst part of all is when there are .doc files. Why the hell would anybody put a .doc file on a website? Many .doc files can be viewed using LibreOffice, but mostly there are several problems with the actual layout then.

On some recent sites, such .doc files have been used as "forms" one should fill out and then send back by e-mail - which is extremely annoying. Not everybody has MS Word, MS Word is pretty expensive, and not everybody uses Windows or Mac OS X in the first place. And OpenOffice sometimes makes mistakes when rendering. So: just do not use .doc files!

And in any case: do not rely on your form being usable by everybody. Have somebody answer e-mails when there is a problem with filling in the form. One example of why is given here.

You might argue that it is easy to say what is wrong, but hard to say what is right. So, here are several suggestions on how to (probably) do it right. Of course, without any warranty.

Firstly, of course, one should discuss what exactly "right" means. The first requirement is that the software for filling in the forms must be free of charge and available on all - well, at least the mainly used (including Linux!) - systems, so everyone can use it. Ideally, there should be an open source software product for it. But of course, it is useless to try to make something run on all systems - there are so many screwed-up configurations out there that this is clearly impossible. The obvious solution: let a human handle the exceptions. There is no need to do much; just checking whether everything in a submitted form is right, and if not, sending a mail requesting the missing information, should be sufficient. But please, let somebody who "knows about computers" do that! The least helpful thing is a secretary asking you why you do "not just open it with Word".

One alternative for filling in forms is the PDF form format. It works with many programs, including open source software. The problem is that it is not widespread yet, and ... somehow ... sucks.
Similarly, there are Google Docs. The main disadvantage is: Google knows what you write. But ... who cares?

A possibility similar to PDF forms are plain PDF files, which can be annotated with software like Okular or Xournal, or by just printing, filling in and scanning them. The main disadvantage is that probably most people are too stupid to use them: they do not have a scanner (and do not know that most copy shops have one), they do not know how to properly print PDF files (!!), let alone how to annotate them.
Alternatively, PNG images can be edited using MS Paint, Gimp, etc.

Also widespread is support for Flash and Java applets. Whyever one would need Flash or Java for filling in a simple form, this should at least be something everybody can use. Of course, both are based on closed source technology, but both have open source alternatives with which one could try to stay compatible. Explicitly stating and maintaining this compatibility would of course be necessary. An advantage of Java applets is that they are a comparably old technology, and therefore comparably stable.

And now ... I come to the "revolutionary" part, which some web designers would kill me for even thinking about: using old technology that is widespread and has proven stable!

For example, it may not look as good as a well-designed Word document with a nice shiny header, but still, a simple text document (.txt), designed to be viewed with a monospace font, should be sufficient for most purposes. The advantage is that virtually every system can display it. One problem, however, is the encoding. You can request the user to use UTF-8, but many users do not know what that means, and requesting users not to use special characters is hard when, for example, their name contains them. Good editors like Emacs and Vim allow annotations that indicate the encoding, and Notepad supports BOMs. It may not be perfect, but it should work in many cases; an example of such an annotation is shown below. A major disadvantage is that every user can freely screw up the form - but this can be done with .doc files as well.
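As an illustration, the top of such a text form could look like this - the -*- ... -*- line on top is real Emacs file-variables syntax for declaring the encoding; the form itself is made up:

-*- coding: utf-8 -*-
Application form (please keep the layout intact)

Name:            ...........................
E-mail address:  ...........................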

Finally, there is one technology which is widespread, takes care of encoding, and can be used on virtually every device: good old web forms and HTTP. The main disadvantage is that you cannot just hand a file to the user which he then mails back to you; you have to provide some server application. However, this is normally not an issue. Producing some CSV output, so that the user can continue working on the form later by uploading it again, as well as producing a beautiful PDF from the input to mail to some secretary, can both be considered solved problems. And - surprise - you do not need any JavaScript for this.
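To illustrate that no heavy machinery is required, here is a minimal CGI-style sketch in C (my own sketch, not a finished program: the missing %-decoding and the missing CSV quoting are deliberate simplifications). It reads a POSTed form body like name=Alice&mail=alice%40example.org and echoes the field values as one CSV line:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void) {
  static char buf[4096];
  /* CGI passes the POST body on stdin, its size in CONTENT_LENGTH */
  const char* len = getenv("CONTENT_LENGTH");
  long n = len ? atol(len) : 0;
  if (n <= 0 || n >= (long) sizeof(buf)) {
    printf("Status: 400 Bad Request\n\n");
    return 0;
  }
  if (fread(buf, 1, n, stdin) != (size_t) n) return 1;
  buf[n] = '\0';
  printf("Content-Type: text/csv\n\n");
  /* the body looks like "key1=value1&key2=value2" - emit the values */
  int first = 1;
  for (char* tok = strtok(buf, "&"); tok; tok = strtok(NULL, "&")) {
    char* eq = strchr(tok, '=');
    printf("%s%s", first ? "" : ",", eq ? eq + 1 : "");
    first = 0;
  }
  printf("\n");
  return 0;
}

Any web server that can run CGI programs can serve this behind a plain HTML form - no JavaScript involved.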

We have seen a basic introduction to infinity and its definition in part 1 of this series, and in part 2 we have seen that infinitary processes can have a finite outcome. In part 3, we have seen a lot of nice examples, getting deeper into the topic of infinite curves in the plane. However, we moved away from the actual question of how many elements a set has, since the curves are more illustrative in the beginning. With this post, we turn back to that question.

Let us recall how we defined infinity in part 1: a set is infinite if and only if there is a bijection between it and a proper subset of itself.

While it is pretty clear that there are finite sets with different numbers of elements, the question whether there are infinite sets that contain more elements than \mathbb{N} has not been answered yet. And while it is pretty clear that of two finite sets we can always say which one is larger, this is not clear at all when talking about infinite sets. The definition is the obvious one: if there is a bijection from a set A onto a subset of B, then B has at least as many elements as A; we denote this by |A|\le |B|.
So, as said, the first question that arises is whether two sets are always comparable in this sense. This follows directly from the well-ordering theorem, but we will not discuss the proof here, since it is rather involved. We just keep in mind that for any two sets A and B, we have |A|\le |B| or |B|\le |A|.

Let us denote by |A|=|B| that the sets A and B contain equally many elements, that is, that there is a bijection between them. Then the obvious question is whether |A|\le |B| and |B|\le |A| together imply |A|=|B|. Do not let the notation mislead you: this is not clear! Essentially, the question is whether, given a bijection from A onto a subset of B and a bijection from B onto a subset of A, we also have a bijection between A and B.
By the Cantor-Bernstein theorem, this holds, but we will not discuss the proof here, and just accept that it works for all sets we will consider.

Now, when thinking about the "number" of elements of the set \mathbb{N}_1 of positive integers, two obvious questions arise: whether there is an infinite set smaller than \mathbb{N}_1, and whether there is an infinite set larger than \mathbb{N}_1. For the first question, it is sufficient to prove that every infinite subset of \mathbb{N}_1 has a bijection onto \mathbb{N}_1. This can be done with the minimum principle defined in part 1, and is left to the interested reader - a sketch is given below. So indeed, \mathbb{N}_1 is the "smallest" infinite set, in the sense that every other infinite set has at least as many elements as \mathbb{N}_1.
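Here is a sketch of that bijection (in my words, using the minimum principle): given an infinite subset S\subseteq\mathbb{N}_1, always pick the smallest element not picked so far,

 f(1)=\min S,\qquad f(n+1)=\min\left(S\setminus\{f(1),\ldots,f(n)\}\right)

This f:\mathbb{N}_1\rightarrow S is injective by construction, and it is surjective since every s\in S is picked after at most s steps.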

In part 1, we have already seen that the first obvious "larger" set, the set \mathbb{Z} of integers, is in fact not larger than \mathbb{N}_1: the mapping \varphi(n)=(-1)^n\lceil\frac{n}{2}\rceil, that is, \varphi(0),\varphi(1),\varphi(2),\varphi(3),\varphi(4),\ldots = 0,-1,1,-2,2,-3,3,-4,4,\ldots, is a bijection. So in fact, there are as many non-negative integers as there are integers at all.

The next obvious candidate is the set \mathbb{N}_0\times\mathbb{N}_0 of pairs (a,b) of nonnegative integers. You can imagine them as points of a "grid":


The question can now be viewed more geometrically: is it possible to arrange these points along a curve? And of course, it is:



We get the sequence (0;0), (0;1), (1;0), (2;0), (1;1), (0;2), \ldots. This sequence is defined by the Cantor pairing function \pi. It is a bijection, and we call the inverse for its left argument \pi_L and the inverse for its right argument \pi_R; thus, \pi(\pi_L(a),\pi_R(a))=a.
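For definiteness: the standard explicit pairing function on nonnegative integers, which runs through the same diagonals as the picture (though each diagonal in a fixed direction instead of the zigzag), is

 \pi(a,b)=\frac{(a+b)(a+b+1)}{2}+b

The Java program below uses exactly this formula, shifted to positive integers.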

So, \mathbb{N}_0\times\mathbb{N}_0 is not larger than \mathbb{N}_1. Therefore, trivially, the sets of triples, quadruples, etc., of natural numbers are not larger than the set of natural numbers either, as we can encode, for example, triples as nested pairs: (1;2;3)=(1;(2;3)).

So, the next obvious candidate is the union of all these sets: the set \mathbb{N}_1^\omega of finite sequences of natural numbers. As you might expect, this set is also countable. There is a more general theorem stating that a union of countably many countable sets is itself countable, which would apply here, but in this case we can show it more directly by giving a bijection. Firstly, let \pi_n:\mathbb{N}_1^n\rightarrow\mathbb{N}_1 be the bijection from the n-tuples to the natural numbers (where \pi_1(x)=x), which exists as shown above. Then define

 \pi_\omega(x_1,\ldots,x_n)=\pi(n,\pi_n(x_1,\ldots,x_n))

Its inverse is given by

 \pi_\omega^{-1}(x)=\pi_{\pi_{L}(x)}^{-1}(\pi_R(x))=\left\{\begin{array}{cl} \pi_{R}(x) & \mbox{ for } \pi_{L}(x)=1 \\ \pi_2^{-1}(\pi_R(x)) &\mbox{ for } \pi_{L}(x)=2 \\ \pi_3^{-1}(\pi_R(x)) &\mbox{ for } \pi_{L}(x)=3 \\ 
\pi_4^{-1}(\pi_R(x)) &\mbox{ for } \pi_{L}(x)=4 \\ \cdots & \cdots \end{array}\right.


As this looks rather theoretical and confusing, here is a Java program which (I hope) actually does this:

public class CantorStuff {

    /* Cantor pairing on positive integers: shift both arguments to
       0-based values, apply (x+y)(x+y+1)/2 + y, and shift back. */
    public static int cantorPair (int x, int y) {
        x--; y--;
        return (x+y)*(x+y+1)/2+y+1;
    }

    /* Inverse of cantorPair: recover the diagonal index w from z,
       then the position y within that diagonal. */
    public static int[] cantorPairToNumbers (int z) {
        z--;
        int w = (int) Math.floor((Math.sqrt(8*z+1)-1)/2);
        int t = (w*w+w)/2;
        int y = z-t;
        return new int [] { w-y+1, y+1 };
    }

    /* pi_omega: right-fold the sequence with cantorPair to obtain
       pi_n(x_1,...,x_n), then pair the result with the length n. */
    public static int sequenceToNum (int[] seq) {
        int ret = seq[seq.length - 1];
        for (int i = seq.length - 2; i >= 0; i--) {
            ret = cantorPair(seq[i], ret);
        }
        return cantorPair(seq.length, ret);
    }

    /* Inverse of sequenceToNum: the first unpairing yields the length,
       then the remaining code is unpaired element by element. */
    public static int[] numToSequence (int num) {
        int[] ctpn = cantorPairToNumbers(num);
        int[] ret = new int[ctpn[0]];
        for (int i = 0; i < ret.length - 1; i++) {
            ctpn = cantorPairToNumbers(ctpn[1]);
            ret[i] = ctpn[0];
        }
        ret[ret.length - 1] = ctpn[1];
        return ret;
    }

    public static void main (String[] args) {
        if (args[0].equals("num")) {
            // decode a number into the sequence it encodes
            int num = Integer.parseInt(args[1]);
            int[] seq = numToSequence(num);
            for (int i = 0; i < seq.length; i++) {
                System.out.print(seq[i] + " ");
            }
            System.out.println("");
        } else if (args[0].equals("seq")) {
            // encode the given sequence as a single number
            int len = args.length - 1;
            int[] seq = new int[len];
            for (int i = 1; i <= len; i++) {
                seq[i-1] = Integer.parseInt(args[i]);
            }
            System.out.println(sequenceToNum(seq));
        }
    }
}
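For example, java CantorStuff seq 1 2 3 should print 1126, and java CantorStuff num 1126 should print 1 2 3 again (I computed the 1126 by hand from the formulas above, so treat it as a plausibility check, not a guarantee).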


So in fact, not even the set of finite sequences of natural numbers is larger than the set of natural numbers.

OK, the next canonical candidate would be the set \mathbb{Q} of rational numbers. But every rational number is of the form \pm\frac{a}{b}, a,b\in\mathbb{N}, where a and b are coprime, and this form is unique except for 0. That is, \mathbb{Q} can be regarded as a subset of \{+,-\}\times\mathbb{N}_0\times\mathbb{N}_1, which means that it is, in fact, countable.

OK, so the rationals are not larger than the set of natural numbers either. But now we finally approach a set which actually is larger: the set \mathbb{R} of real numbers.

In part 1 we have seen the following bijection, which makes it sufficient to show that there is no bijection f:\mathbb{N}\rightarrow]0;1[, because the composition of two bijections is, obviously, again a bijection.



So assume there were a bijection f:\mathbb{N}\rightarrow]0;1[. Every element of ]0;1[ has a decimal representation, say f(n)=\sum\limits_{i=1}^\infty 10^{-i}a_{ni}= 0.a_{n1}a_{n2}a_{n3}\ldots with a_{ij}\in\{0,1,2,3,4,5,6,7,8,9\}, and we may assume that, if possible, this representation is stationary, that is, ends with infinitely many zeroes (we must fix a choice here, since for example 0.1=0.0\overline{9}, as we have seen in part 1).

Now define for all i\in\mathbb{N}

b_i = \left\{\begin{array}{cl} 2 & \mbox{ for } a_{ii}=1 \\ 1 & \mbox{ otherwise} \end{array}\right.

Then b:=\sum\limits_{i=1}^\infty 10^{-i}b_i=0.b_1b_2b_3\ldots is a real number in ]0;1[ whose digits are only ones and twos, so it has exactly one decimal representation, and it differs from each f(j), j\in\mathbb{N}, in the j-th digit. So f cannot be surjective. Contradiction.

The same argument even shows that there is no surjective such f at all; that is, \mathbb{R} contains strictly more elements than \mathbb{N}. The powerset \mathfrak{P}(\mathbb{N}), the set of all subsets of \mathbb{N}, is another example, and the proof is somewhat similar: for any function f:\mathbb{N}\rightarrow\mathfrak{P}(\mathbb{N}), the set \{i\in\mathbb{N}|i\not\in f(i)\} is not in the range of f (proof: exercise!), so f cannot be surjective.

Now of course the question arises whether \mathfrak{P}(\mathbb{N}) is larger or smaller than \mathbb{R}. They have equal cardinality, and it is possible to give a bijection, but it is rather sophisticated, so this is left as an exercise for the interested reader.

In a former post (on my old blog), I already had the idea of using SIGSEGV handlers to create a crude form of lazy evaluation. It was not very usable yet, however. I have meanwhile engineered around a bit and found out a lot of interesting things about memory management and signal handling in general. Though I no longer think that this is a good way of doing lazy evaluation, it was worth learning this much. I still think that, with proper kernel support, lazy evaluation could be implemented more easily; it is just that SIGSEGV handlers do not seem to be the appropriate way to do it. However, keep in mind that I am not an expert, I might be wrong with what I say, and I am open to constructive comments of any kind.

Also, this post is long, and I give a lot of intermediate steps before the final result, since the outcome itself is less interesting than the many facts one can learn on the way, I think.

Since I did some assembler haxxory in the beginning, I wondered how exactly this signal handling stuff works. If I had to design a signal handling architecture, I would probably just save the context of the program into a special reserved memory region and then call the signal handler, leaving everything else to that handler - I think it is best for a kernel to keep quiet as long as possible.

Linux does something else, however. It maps a memory region into each process, the vsyscall page. It used to lie in the next-to-last block of the address space; meanwhile, as far as I have read, its modern successor, the vDSO, is mapped at a randomized position. There, the current system time is kept, along with some functions that can be called instead of making a real syscall, as an optimization.
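You can look at these mappings yourself; the following little C program (my addition, just for trying it out) prints the relevant lines of its own memory map:

#include <stdio.h>
#include <string.h>

int main (void) {
  /* /proc/self/maps lists one mapping per line (Linux-specific) */
  char line[512];
  FILE* f = fopen("/proc/self/maps", "r");
  if (!f) return 1;
  while (fgets(line, sizeof(line), f)) {
    if (strstr(line, "[vdso]") || strstr(line, "[vsyscall]"))
      fputs(line, stdout);
  }
  fclose(f);
  return 0;
}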

Now, when a signal handler is called, the kernel pushes a return address onto the stack, such that when the handler returns, the process jumps into this vsyscall page. That page then invokes the sigreturn syscall (see 5.6@here). Two interesting things related to this topic: handling signals in assembly, and the list of Linux syscalls.

So much for the theory. Let us dig in a bit deeper. One problem with the approach in my former post on this topic was that the actual memory page had to be allocated after the segmentation fault; it was not possible to redirect the process to some other memory region. For this, two things are necessary.

Firstly, every segmentation fault comes from some memory-accessing assembler instruction which dereferences some register. When jumping back from the handler, this instruction is executed again and causes the SIGSEGV to be raised again. We therefore have to set that register to a new value, pointing to the place where the lazily allocated object is stored.

The register state is saved in a ucontext structure, documented in setcontext(2). It contains a non-portable mcontext structure, which is defined in sys/ucontext.h and is not even portable between x86_32 and x86_64. It has an array gregs[] in which the register values are saved, and from which they are restored. If we know the dereferenced register, we can therefore set it to something else. With a bit of assembler magic, we already get the following code:

#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <stdlib.h>
#define __USE_GNU
#include <ucontext.h>

// set this to REG_ECX on x86_32
#define myREG REG_RCX

int k = 42;

void handle_segv(int segv, siginfo_t* siginfo, void* ucontext) {
  /* point the dereferenced register at &k; on return, the faulting
     instruction is restarted with the new register value */
  ucontext_t* uc = (ucontext_t*) ucontext;
  uc->uc_mcontext.gregs[myREG] = (greg_t) &k;
}

void cause_segv (void* ptr) {
  int d;
  // d =  *ptr;
  asm ("movl (%%rcx), %%edx" : "=d"(d) : "c"(ptr)); // (%%ecx) on x86_32
  printf("%d\n", d);
}

int main (void) {
  struct sigaction q;
  memset(&q, 0, sizeof(q));
  q.sa_sigaction = handle_segv;
  q.sa_flags = SA_SIGINFO;
  sigaction(SIGSEGV, &q, NULL);
  cause_segv(NULL);
  return 0;
}

The output of this program is "42". And even though the program segfaults, no segfault shows up in dmesg.

However, we do not want assembly code if we can avoid it. That is, we do not want to assume that it was the RCX register which was dereferenced; we need to find out which register it actually was. Therefore, we use a library for x86 disassembly: libudis86. We simply disassemble the code at the instruction pointer from our gregs[] array and look at what it does. Here is the code we get:

#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <stdlib.h>
#define __USE_GNU
#include <ucontext.h>
#include <udis86.h>

/* This code is only for x86_64 */

int k = 42;

void handle_segv(int segv, siginfo_t* siginfo, void* ucontext) {
  ucontext_t* uc = (ucontext_t*) ucontext;

  /* disassemble at the instruction pointer */
  ud_t ud_obj;
  ud_init(&ud_obj);
  ud_set_mode(&ud_obj, 64);
  ud_set_syntax(&ud_obj, UD_SYN_ATT);
  ud_set_input_buffer(&ud_obj, (unsigned char*) uc->uc_mcontext.gregs[REG_RIP],
		      10);

  /* was disassembly successful? */
  if (!ud_disassemble(&ud_obj)) {
    printf("Disassembly fail!\n");
    exit(-1);
  }

  /* is disassembly a memory-operation? */
  struct ud_operand op0 = ud_obj.operand[0];
  struct ud_operand op1 = ud_obj.operand[1];
  struct ud_operand op;

  if (op0.type == UD_OP_MEM) {
    op = op0;
  } else if (op1.type == UD_OP_MEM) {
    op = op1;
  } else {
    printf("Instruction unknown\n");
    exit(-1);
  }

  /* find out the register - this part is clumsy as we have two sets
     of constants */

  int setreg;
  switch (op.base) {
  case UD_R_RAX:
  case UD_R_EAX:
    setreg = REG_RAX;
    break;
  case UD_R_RCX:
  case UD_R_ECX:
    setreg = REG_RCX;
    break;
  case UD_R_RDX:
  case UD_R_EDX:
    setreg = REG_RDX;
    break;
  case UD_R_RBX:
  case UD_R_EBX:
    setreg = REG_RBX;
    break;
  case UD_R_RSP:
  case UD_R_ESP:
    setreg = REG_RSP;
    break;
  case UD_R_RBP:
  case UD_R_EBP:
    setreg = REG_RBP;
    break;
  case UD_R_RSI:
  case UD_R_ESI:
    setreg = REG_RSI;
    break;
  case UD_R_RDI:
  case UD_R_EDI:
    setreg = REG_RDI;
    break;
  case UD_R_R8:
  case UD_R_R8D:
    setreg = REG_R8;
    break;
  case UD_R_R9:
  case UD_R_R9D:
    setreg = REG_R9;
    break;
  case UD_R_R10:
  case UD_R_R10D:
    setreg = REG_R10;
    break;
  case UD_R_R11:
  case UD_R_R11D:
    setreg = REG_R11;
    break;
  case UD_R_R12:
  case UD_R_R12D:
    setreg = REG_R12;
    break;
  case UD_R_R13:
  case UD_R_R13D:
    setreg = REG_R13;
    break;
  case UD_R_R14:
  case UD_R_R14D:
    setreg = REG_R14;
    break;
  case UD_R_R15:
  case UD_R_R15D:
    setreg = REG_R15;
    break;
  default:
    printf("Register not supported!\n");
    exit(-1);
  }

  /* set the register - as before */
  uc->uc_mcontext.gregs[setreg] = (greg_t) &k;
}

void cause_segv (int* ptr) {
  int d;
  d =  *ptr;
  printf("%d\n", d);
}

int main (void) {
  struct sigaction q;
  memset(&q, 0, sizeof(q));
  q.sa_sigaction = handle_segv;
  q.sa_flags = SA_SIGINFO;
  sigaction(SIGSEGV, &q, NULL);
  cause_segv(NULL);
}

With this, we can lazily evaluate one object.

Secondly, every lazy memory access has to go through an indirection, so we need a local pool of protected addresses, each standing for one lazily evaluated object - this way, we can do it for more than one object.
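As a side note, such a pool can also be built without any disassembly at all. The following is my own sketch (not the code from the repository mentioned below): reserve inaccessible placeholder pages with mmap and PROT_NONE, and let the SIGSEGV handler use siginfo->si_addr to find out which placeholder was touched, make that page accessible, and fill in the lazily computed value, so that the restarted instruction succeeds:

#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

#define NOBJS 4

static char* pool;     /* base of the placeholder region */
static long pagesize;

void handle_lazy(int segv, siginfo_t* siginfo, void* ucontext) {
  /* which placeholder page was dereferenced? */
  char* page = (char*) ((uintptr_t) siginfo->si_addr
                        & ~(uintptr_t) (pagesize - 1));
  if (page < pool || page >= pool + NOBJS * pagesize)
    _exit(1);          /* a real crash, not one of our placeholders */
  /* make the page accessible and "evaluate" the lazy object; returning
     restarts the faulting instruction, which now succeeds */
  mprotect(page, pagesize, PROT_READ | PROT_WRITE);
  *(int*) page = 42 + (int) ((page - pool) / pagesize);
}

int main (void) {
  pagesize = sysconf(_SC_PAGESIZE);
  /* four inaccessible pages = four lazy objects */
  pool = (char*) mmap(NULL, NOBJS * pagesize, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (pool == (char*) MAP_FAILED) return 1;
  struct sigaction q;
  memset(&q, 0, sizeof(q));
  q.sa_sigaction = handle_lazy;
  q.sa_flags = SA_SIGINFO;
  sigaction(SIGSEGV, &q, NULL);
  int i;
  for (i = 0; i < NOBJS; i++)  /* each first access faults once */
    printf("%d\n", *(int*) (pool + (long) i * pagesize));
  return 0;
}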

We will use Matthias Benkard's implementation of Patricia trees to get a dynamic array mapping protected pointers to the actual functions. We also need a slice allocator. Actually, creating that code was rather involved, and I do not want to post anything as "finished" yet; if you are interested in this most general code, look at my git repository for this project. It defines a function for creating list comprehensions from a function from integers to integers, which may depend on the actual list.

However, that code is not threadsafe. Making it threadsafe is one of my aims for this project, if I keep working on it.

Even having made this work, I do not think that this is a good solution. Segmentation faults are faults by design, and they should be treated as faults. That does not mean they should not be handled properly (which most software fails to do anyway), but they should not be produced on purpose. Rather, the kernel should provide another state for memory pages, say "lazy pages", which may be allocated, but for which a signal - or some other sort of interrupt - is also delivered.

Some other sort of interrupt would be better, since signal handling involves two context switches. In fact, it would be sufficient to just push the state of the thread onto the stack and jump to a handling function, without having to call sigreturn. Handling such an interrupt does not require anything to happen behind the back of the process, so the state can be restored by the handling function itself, at least mostly. Maybe a flag indicating whether some sort of sigreturn has to be called could be helpful - for example, if the kernel had to extend the stack or something similar.

This is part 3 of my series on infinity.

In part 2, we have seen that there are infinitary processes with a finite outcome. In this part, we will see something which can, at least "philosophically", be considered the other way around: how to embed infinity into finite objects.

At the end of part 1, we have already seen the following bijection:

It embeds the whole real line into a bounded open interval, so statements about the real line become statements about this interval, and vice versa. However, while this bijection maps something of infinite length onto something of finite length, it is not quite what we intuitively consider an "embedding", and we may ask whether it is possible for a finite object to contain, for example, a line of infinite length. Now consider the following construction: you start with a square of side length 1. You cut this square into two halves, and into the left half you put an isosceles triangle of height 1 and width 0.5. You cut the right half into two halves again, and put an isosceles triangle of height 1 and width 0.25 into the left one. Continue this infinitely often, always cutting the right rectangle into two halves and embedding an isosceles triangle into the left one. The following graphic shows this:



The slanted sides (blue in the graphic) of the n-th triangle together have length 2\sqrt{1+\frac{1}{4^{n+2}}} (which follows from the Pythagorean theorem), but the important fact is that this is always strictly greater than 2: the height alone is already 1, and each of the two sides must trivially be longer than the height. All these slanted sides together form one line, the line colored blue in the graphic, and this line must have infinite length. So in fact, it is possible for a square to contain a line of infinite length.

Here, the special character of lines must be taken into account: lines have no width. Lines are ideal objects which only have a length. Of course, every line we draw is not ideal, in the sense that - to be able to see it - it needs to have a width; every line which is actually drawn is only an approximation of a line. Still, the ideal object "line" is not useless: it is a formal object which can be approximated with (probably) arbitrary precision, and its definition was a success and is implicitly used in huge parts of science.

Let us consider another graphic: assume you draw a square of side length 1. Centered inside this square, you draw a square of side length 0.5, inside that one a further square of side length 0.25, and so on. The following graphic shows this:



The circumference of the n-th square has length \frac{4}{2^{n-1}}. So the sum of the lengths is a geometric series, as we saw in part 2, and we have \sum\limits_{i=1}^\infty\frac{4}{2^{i-1}} = \sum\limits_{i=0}^\infty\frac{4}{2^i}=\frac{4}{1-\frac{1}{2}}=8; that is, all the lines we have drawn together have length 8. So, informally speaking, we can draw this "with a pen": we will not need an infinite amount of ink, even though the figure has infinitely many parts.

So much for this time. We have seen some nice examples of the interplay between finiteness and infinity when talking about length. Next time it gets a bit harder: we will again look at the countability of sets.

This is the second post of my series on infinity - be sure to read the first one.

We start this chapter with a nice story, "Achilles and the Tortoise", one of Zeno's paradoxes, though we adapt it slightly. So, Achilles races a tortoise. He gives the tortoise a head start of 1 mile. Assuming Achilles is four times as fast as the tortoise, when he arrives at mile 1, the tortoise is at 1.25 miles. When he arrives at 1.25 miles, the tortoise is at 1.3125 miles. When he arrives at 1.3125 miles, the tortoise is at 1.328125 miles. And so on. Whenever Achilles reaches the point where the tortoise just was, the tortoise has already moved on. This way, Achilles would have to pass through infinitely many points before catching up with the tortoise.

Of course, this is not really a paradox. The point is that we have an infinitary process here. As mathematicians do, let us write the problem down more formally. Let a_i be the i-th point where Achilles stands, and t_i the corresponding point where the tortoise stands. We have a_0 = 0 and t_0=1, respectively. By definition, a_{n+1} = t_n, because Achilles always reaches the tortoise's former position. As the tortoise has a 1 mile head start and moves at \frac{1}{4} of Achilles' speed, we get t_{n+1}=1+\frac{a_{n+1}}{4}, which is t_{n+1}=1+\frac{t_{n}}{4}. It is easy to see that t_n = 1+\frac{1}{4}+\frac{1}{16}+\ldots+\frac{1}{4^n}, that is, t_n=\sum\limits_{i=0}^n \frac{1}{4^i}, which is the shorter mathematical notation for such sums.

Using the axioms we saw in part 1, we can prove this formally (if you are already a little confused, just skip this proof, it is not necessary for the further understanding): assume there is an n such that t_n\neq\sum\limits_{i=0}^n\frac{1}{4^i}; then there is a smallest such n. Clearly, t_0=1=\frac{1}{4^0}=\sum\limits_{i=0}^0\frac{1}{4^i}, so n\neq 0. Therefore, n has a predecessor n-1, for which we must have t_{n-1}=\sum\limits_{i=0}^{n-1}\frac{1}{4^i}, since n was minimal. But then t_{n} = 1+\frac{t_{n-1}}{4} = 1+\frac{1}{4}\cdot\sum\limits_{i=0}^{n-1}\frac{1}{4^i}=1+\sum\limits_{i=1}^{n}\frac{1}{4^i}=\sum\limits_{i=0}^{n}\frac{1}{4^i}. Contradiction.

Obviously, Achilles will pass the tortoise at some point, and obviously, this point is greater than all t_n. We want to find out more about the point where he passes the tortoise.



The point we are looking for is \frac{4}{3}. In fact, this is provable, but it is a bit harder than the induction above. What we have here is a so-called geometric series. As we have t_{n}=1+\frac{1}{4}+\frac{1}{16}+\ldots+\frac{1}{4^n}, we have \frac{1}{4}t_{n}=\frac{1}{4}+\frac{1}{16}+\ldots+\frac{1}{4^{n+1}}; that is, t_{n} and \frac{1}{4}t_{n} differ exactly in the terms 1 and \frac{1}{4^{n+1}}, and thus we have

t_{n}-\frac{1}{4}t_{n}=1-\frac{1}{4^{n+1}}
t_{n}(1-\frac{1}{4})=1-\frac{1}{4^{n+1}}
(\frac{3}{4})t_n=1-\frac{1}{4^{n+1}}
t_n = \frac{1-\frac{1}{4^{n+1}}}{(\frac{3}{4})}
t_n = \frac{4}{3}(1-\frac{1}{4^{n+1}})

From this we see that the larger n gets, the smaller \frac{1}{4^{n+1}} gets, and therefore, for very large n, t_n = \frac{4}{3}(1-\frac{1}{4^{n+1}}) approaches \frac{4}{3}(1-0)=\frac{4}{3}. In fact, with a little more preliminary work, this would be a valid mathematical proof, and mathematicians would write \lim\limits_{n\rightarrow\infty} t_n = \frac{4}{3}, where \infty is the symbol for "infinity", and call \frac{4}{3} the limit of this series. As we wrote \sum\limits_{i=0}^{n}\frac{1}{4^i} for the finite sum, we also write \sum\limits_{i=0}^{\infty}\frac{1}{4^i} for its limit. The following graphic gives a certain geometric intuition for this fact:


(source)
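If you prefer watching the numbers, here is a tiny C program (my own illustration, not part of the proof) that simply iterates the recurrence t_{n+1}=1+\frac{t_n}{4} from above:

#include <stdio.h>

int main (void) {
  /* iterate t_{n+1} = 1 + t_n/4, starting from t_0 = 1 */
  double t = 1.0;
  int n;
  for (n = 0; n <= 10; n++) {
    printf("t_%d = %.10f\n", n, t);
    t = 1.0 + t / 4.0;
  }
  printf("4/3  = %.10f\n", 4.0 / 3.0);
  return 0;
}

After ten steps, the difference to \frac{4}{3} should already be below 10^{-6}.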

Of course, it is not always that easy; not every infinite process has a finite outcome. For example, the infinite sum \sum\limits_{i=1}^\infty i=1+2+3+\ldots obviously does not. However, its finite parts get arbitrarily large, so we may write \infty=\sum\limits_{i=1}^\infty i. This series diverges, while the geometric series above converges.

But it can be even worse: consider the series \sum\limits_{i=0}^\infty (-1)^i=1-1+1-1+1-1+\ldots. Its finite parts are alternately 1 and 0, and there is no "tendency" towards anything as we approach infinity. This series neither converges nor diverges to infinity; \sum\limits_{i=0}^\infty (-1)^i is not well-defined.

The example above with the tortoise was one special geometric series. The general (infinite) geometric series is \sum\limits_{i=0}^\infty a^i, which converges for 0\le a<1, and then has the limit \sum\limits_{i=0}^\infty a^i=\frac{1}{1-a} - in our case, we had a=\frac{1}{4}. For a general proof, consult the Wikipedia article, or any good introduction to calculus.

Even more generally, we have \sum\limits_{i=0}^\infty ba^i = \frac{b}{1-a}. With this formula, we can prove a fact quite a lot of people are not willing to accept: 0.\overline{9}=1, where 0.\overline{9} means zero followed by infinitely many nines after the decimal point. Indeed, we have 0.\overline{9}=\frac{9}{10}+\frac{9}{100}+\frac{9}{1000}+\ldots=\sum\limits_{i=0}^\infty \frac{9}{10^{i+1}}, which is \sum\limits_{i=0}^\infty \frac{9}{10}\cdot\frac{1}{10^{i}}, which is, according to our formula, \frac{9}{10}\cdot\frac{1}{1-\frac{1}{10}}=\frac{9}{10}\cdot\frac{10}{9}=1.

There is a general rule about recurring decimals which you may know from school, namely 0.\overline{a_1a_2\ldots a_n} = \frac{a_1a_2\ldots a_n}{99\ldots 9} (with n nines in the denominator). For example, the above follows from this: 0.\overline{9}=\frac{9}{9}=1. Written more precisely, we have \sum\limits_{i=1}^\infty \frac{x}{10^{k\cdot i}} = \frac{x}{10^k-1}, which follows from our formula above by \sum\limits_{i=1}^\infty \frac{x}{10^{k\cdot i}}=\frac{x}{10^k}\sum\limits_{i=0}^\infty \frac{1}{10^{k\cdot i}}=\frac{x}{10^k}\cdot\frac{1}{1-\frac{1}{10^k}}=\frac{x}{10^k}\cdot\frac{1}{\frac{10^k-1}{10^k}}=\frac{x}{10^k-1}.

As you see, studying these kinds of objects can get very complicated, but produces a lot of interesting results. If you did not quite get the last part, do not worry - it is rather sophisticated for a non-mathematician.

#! /bin/bash

# Render a bit string as a black-and-white grid using flydraw:
# each bit becomes an 8x8 pixel cell, cells for "0" are painted black.

width=22
text=0110110011011101111001011010110101110111011\
000001000010111011101100110101101011101110110011\
0101101000100011001

(
textlen=$(echo -n "$text" | wc -c)
height=$((textlen/width))
# canvas of 8x8 pixels per bit, flood-filled white
echo new $((8*width)),$((8*height))
echo fill 10,10,255,255,255
j=0; while [ "$j" -lt "$height" ]; do
i=0; while [ "$i" -lt "$width" ]; do
# consume the first character of $text
ftext=${text:0:1}
text=${text:1}
if [ "$ftext" = "0" ]; then
echo frect $((8*i)),$((8*j)),$((8*i+8)),$((8*j+8)),0,0,0;
fi
((i++))
done
((j++))
done
echo output hi.gif
) | flydraw
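Piped into flydraw, this should write hi.gif: a grid of 8x8-pixel cells, one per bit, black where the bit is "0". The string has 110 bits, so with width 22 the image comes out as a 22x5 grid.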