Infinity. One central mathematical concept on which many myths have spread, especially in the world outside science. Therefore, I had the idea of writing a series of blog-posts about several aspects of "Infinity".

These posts are not scientific, they are for non-Mathematicians, to demystify the concept of "infinity". But neither is it one of those texts that try to include a lot of "history" and interesting side facts. I just want to give a slight understanding of infinity to non-Mathematicians.

So, in this first post, we shall have a short look at the numbers as such. First, everyone knows the non-negative integers \{0,1,2,\ldots\}. There are "infinitely many" of them, since every number n has a unique successor n+1. This should be intuitive to everybody, and in fact, it is an axiom of arithmetic: Every number has a unique successor. Additionally, we have to postulate that every number except 0 has a predecessor.

Another thing you quickly notice: Take an arbitrary sequence of decreasing non-negative integers (that is, start somewhere, and then "count down", possibly leaving steps out), then this sequence will be of finite length: You will soon or later bump into 0. A bit more complicated, think of an arbitrary set S of non-negative integers. Take, for example, the primes, the odd numbers, the even numbers, the phone numbers of your neighbourhood. You will notice that also every of these sets has a minimal element, even though the set itself may contain infinitely many non-negative integers. That is another axiom of arithmetic: Every non-empty set S of non-negative integers has a smallest element.

These two axioms plus the existence of 0 are, in fact, sufficient for elementary arithmetic. A whole lot of theorems can just be implied by these axioms. A simple example: Every number is either of the form 2n or 2n+1 - of course, this is not very profound, it essentially states that every number is odd or even, but we only want to give a simple example. To prove it, assume there was some number p which cannot be written in that way, then by our axiom, there exists a minimal such p. Since 0=2\cdot 0, we know that p\neq 0, and therefore, by axiom, it has a predecessor p-1, and as p was the smallest number which can not be written in that form, we know  that there is an n such that p-1=2n or p-1=2n+1. If p-1=2n, then p=2n+1, which we did not allow for p. If p-1=2n+1, then p=2n+2=2(n+1), which is also not allowed. This is a contradiction! So, such a p cannot exist: All numbers may be written in that form.


Found at KnowYourMeme
Now, of course, there are further objects that are commonly called "numbers". When doing accounting, you are certainly familiar with negative integers. So we get the set of all integers, \mathbb{Z}=\{\ldots, -2, -1, 0, 1, 2, \ldots\}. They do not share the property that every set of them has a smallest element. There are infinitely many of them into "both sides". So intuitively, there are "more" integers than non-negative integers - of course, the latter is a subset of the first.
But what does it mean for two sets of objects to have the same "number" of elements? Let us first look at the finite case. Let us say you are counting apples. You will take one of them and say "one", then a second one, saying "two", and a third one, saying "three", and so on, until no apple is left that was not counted. Let as assume you counted ten apples. You think you just counted the apples, but what you actually did was giving a one-to-one-mapping between apples and the set \{1,2,3,4,5,6,7,8,9,10\}. Of course, you probably did not memorize the order you chose the apples - because it does not matter which mapping you chose, the important fact is that there is such a mapping. Such a one-to-one-mapping between two sets is called a bijection. Even in our example, there are many such bijections (calculating the actual number is left as an exercise), and you only chose one by random.

So, for the finite case, we can conclude that two sets have the same number of elements, if there is a bijection between them. For the infinite case, this becomes the definition. Two infinite sets are of equal size, if and only if there is a bijection between them.

Now, let us go back to the integers. We tried to conclude that since the non-negative integers are a subset of the integers, there must be less of them. But on the other hand, define a mapping \varphi(n)=(-1)^n\lceil\frac{n}{2}\rceil, that is, \varphi(0),\varphi(1),\varphi(2),\varphi(3),\varphi(4),\ldots is 0,-1,1,-2,2,-3,3,-4,4,\ldots. This is a bijection. So in fact, there are as many non-negative integers, as there are integers at all - even though this might not be intuitive, when looking at the finite case.

In fact, this strange behaviour of infinite sets can be used to classify them: A set is finite if and only if there is no bijection into one of its proper subsets. As there are other (mostly equivalent) definitions of finity and infinity, the concept defined here is sometimes called Dedekind-Finity.


Found at Fukung.
This concept is probably hard to understand for a non-Mathematician. The problem is that "finity" is such an intuitive concept, that every non-Mathematician will postulate it was "clear". But this is not the mathematical way of thinking. We will get deeper into this topic in one of the following posts on infinity, when we look at cardinalities. For now, let us get back to numbers.

Of course, even a non-Mathematician knows that there is more than integers. We can extend the numbers to fractions. The set of fractions is called \mathbb{Q}, and it contains all fractions of the form \frac{a}{b} where a and b are integers, where b\neq 0. Fractions have a nice property: between two distinct fractions, there is always a third one. For example, between 0 and 1, there is \frac{1}{2}. Even worse, from this directly follows that between two distinct fractions, there are always infinitely many other fractions.

For example, between 0 and 1, there are the fractions \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \ldots. We have "infinity on finite place", so infinity does not yield distance in any sense. Even worse, we will later show that there is a bijection between integers and fractions: Even though we can squeeze infinitely many of them between 0 and 1, there are not "more" fractions than integers in the above sense.

As everybody should know from school, every fraction can also be written as a decimal fraction, with either finitely many digits or as a recurring fraction. However, we always find finitely many digits to specify them exactly.

And as many of you will probably know, there are also numbers that do not have this property. For example, \pi, the constant to calculate the area of a circle, has no finite nore a recurring decimal representation, only approximations can be written that way. In general, every infinite sequence of digits which has a dot after finitely many steps can be considered a number, like \pi, \sqrt{2}, etc., and the set of these numbers are called the real numbers, and their set is called \mathbb{R}. It can be proved that \mathbb{R} is actually bigger than \mathbb{Z}. We will later see, how this is done.


Found at Mighty Wombat.
Usually, the general education stops at \mathbb{R}, even though there is a larger set of numbers, the complex numbers, \mathbb{C}, which many non-Mathematicians do not consider as numbers, because it goes further than what they usually do with numbers. However, it is relevant in science, and mathematically, it is probably more beautiful than \mathbb{R}. Besides all real numbers, it contains, for example, the imaginary unit i, for which we have i^2=-1, it is (one) "square root" of -1, which we do not have in \mathbb{R}.


An object which is often associated with \mathbb{R} is the real line: The real numbers can be considered an infinite line which contains all of them. Then, the following graphic illustrates a bijection between the real numbers between 0 and 1 and all real numbers:



Especially interesting is that 0 and 1 can somehow be considered as the images of the negative and positive "infinity".

So far for this time.

A great article, with interesting references, and as it is about something many people do not seem to understand, worth reading:

> The abstraction-optimization tradeoff

At the station where I leave the train every week, there is a tunnel where often street musicians play their music. Mostly, their play is so bad and annoying that you want to pay them rather for stopping it, than for playing. However, today, I heard some nice melody from this tunnel, and I thought, maybe a band or something is playing there.

Approaching the source, I saw an old woman, playing on a small electrical piano. Looking a little closer, I found that some keys were missing on the keyboard, and obviously, she was only pretending to play - her keystrokes did not sync with the music, the music was obviously played by that gadget automatically.

I am not sure whether this person was snaky, or just naive to think that nobody will notice, or both. However, at least she knew how to play this gadget in a way that it does not make annoying sounds, even though that only means to choose a track and play it, it is better than what the other people do. And that wtf-feel about it somehow made me smile. I gave her 50 cent, to help her repairing the missing keys on her keyboard.

... gab letzens eine Mitarbeiterin der geliebten TU-Mensa von sich, vermutlich unüberlegt und genervt vom Riechverhalten meines Kommilitonen.

Meinen Kommentar, dies sei im Allgemeinen ein schlechtes Motto für eine Küche, hat sie offenbar nicht verstanden.

... you cannot do anything about it. It sometimes happens. And sometimes it comes so fast, that you cannot react properly on it. But mostly, you notice that you have to cough before you do it.

Especially, you can put something in front of your mouth. Even if it appears like nothing leaves your mouth, when coughing, you spit some stuff, even if you do not see it. That is why children are told to put their hands in front of their mouths.

Usually, this is not a good idea, since you will touch a lot of other stuff with your hand. One should better use either some tissue, or your elbow.

However, it appears that many people have not been told to do so. When sitting in trains or busses, especially in winter, I see people szneesing and coughing, without putting something in front of their mouth, almost every time.

This is disgusting! And it is antisocial - it makes other people sick!

Sie ist endlich da ...


... Besser spät als nie!

I already wrote about a campaign "Vergoogeln Sie keine Zeit" (do not waste your time with google) in my university's library. I usually try not to use it, but today, I was looking for a book to learn for my current tests, about algebraic curves.

Firstly, I tried to survive without technology, and went to the shelves for the topic of algebraic geometry. Since I did not find the book there, I searched with their local web interface. The main problem was probably that I misspelled the actual author, I searched for "Fulten" rather than "Fulton", but every reasonable search mechanism should be able to find the right book with a perfectly correct title given, and a little mistake on the author's name. However, of course I tried to search only for "Fulten" and only for "Algebraic Curves", the latter query returned a long list of highly specialized stuff related to algebraic curves, but not the desired book (at least not at the first three pages of at least 20).

As I did not want to waste any more time with this, I searched with Google for "Fulten Algebraic Curves". It suggested "Fulton Algebraic Curves", and using this suggestion, the first Item I found was the site of William Fulton, publishing his book as a free PDF.

I rest my case.

... an den wichtigen Stellen wird sie nur von Nieten zusammengehalten.

(via)

So, finally, this is part 4 of my series about the irrefutability of the AC from ZF. We are almost there!

Theorem 1: Let ZF-Formulae \phi_0(\vec{x}),\ldots,\phi_n(\vec{x}) be given. Then \forall_\alpha\exists_{\beta\ge\alpha}\forall_{\vec{x}\in V_\beta} \bigwedge\limits_{i=0}^n (\phi_i(\vec{x})\leftrightarrow\phi_i^{V_\beta}(\vec{x})). We then say that V_\beta reflects the \phi_i.
Proof: Let \alpha be given. Without loss of generality we may assume that the \phi_i can be written without universal quantifiers, as we can replace every universal quantifier by a negated existential quantifier of the negated quantfied formula.
Recursively define a sequence \left<\beta_n \mid n<\omega\right> where \beta_0=\alpha and \beta_{n+1} is the smallest ordinal \gamma>\beta_n, such that if \exists_y\theta(\vec{z},y) is a subformula of one of the \phi_i, we have \forall_{\vec{z}\in V_{\beta_n}}((\exists_y\theta(\vec{z},y))\rightarrow\exists_{y\in V_\gamma}\theta(\vec{z},y)). Let \beta be the supremum of the \beta_i. It is sufficient to show that for all subformulae \psi(\vec{z}) of any \phi_i we have \forall_{\vec{z}\in V_\beta}(\psi(\vec{z})\leftrightarrow\psi^{V_\beta}(\vec{z})), which goes by structural induction. If \psi is of the form x\in y this is trivial. The step over \rightarrow,\wedge,\bot is clear. We can assume that every occurence of \forall is in the scope of an occurence of \exists, and therefore, instead of having an induction step for \forall, we have an induction step for \exists: \psi=\exists_y\theta(\vec{z},y), and let \vec{z}\in V_\beta. By induction hypothesis, \psi^{V_\beta}(\vec{z})\rightarrow\psi(\vec{z}). For the other direction, let \exists_y\theta(\vec{z},y), then \exists_y\in V_{\beta_{n+1}}\theta(\vec{z},y). By induction hypothesis, we have \exists_{y\in V_\beta}\theta^{V_\beta}(\vec{z},y).

We can do everything we did in Part 1 as well directly in ZF. Alphabets become sets, and formulae become finite sequences. Especially, there is a set of all formulae, which is countable. Notice that there is no need to fix a special "encoding" for formulae, it is clear that one exists, and it is clear that all such encodings have to be equivalent. For a ZF-formula \phi, let \lceil\phi\rceil be its embedding into ZF, that is, \lceil\phi\rceil is a set.

For every set u we can define an interpretation, interpreting \in by \{\left<x,y\right>\in u\times u \mid x\in y\}. We write u\models \phi if this interpretation (for all valuations) models \lceil\phi\rceil, notice that in this case, \models is a relation inside ZF. Trivially, we have \forall_{\vec{a}\in u}((u\models\phi(\vec{a})) \Leftrightarrow\phi^u(\vec{a})).

Definition: Now we can define the class of OD of ordinal definable sets:

OD:=\{x \mid \exists_{\alpha\in On}\exists_{\vec{p}\in\alpha}\exists_{\lceil\phi\rceil} x=\{y\in V_\alpha \mid  V_\alpha\models \phi(y,\vec{p})\}\}

where we write \exists_{\lceil\phi\rceil} to denote that there must exist an encoded ZF-formula.

This class is interesting, as it messes around with the intuition about two meta-layers. However, it is not yet our goal.

Definition: The transitive closure TC(x) of a set x is the smallest y\supseteq x such that every element of y is also subset of y.

Definition: The class HOD of hereditary ordinal definable sets is defined by

HOD:=\{x \mid TC(\{x\})\subseteq OD\}

Since x\in TC(\{x\}), trivially HOD\subseteq OD.

Lemma 1: x\in HOD \Leftrightarrow x\in OD\wedge x\subseteq HOD.
Proof: "=>": From x\in HOD directly follows TC(\{x\}\subseteq OD, therefore, x\in OD. For y\in x we have TC(\{y\})\subseteq TC(\{x\})\subseteq OD\subseteq HOD, that is, y\in HOD, therefore, x\subseteq HOD.
"<=": For x\in OD and TC(\{x\})\subseteq HOD we have TC(\{x\})=\{x\}\cup\bigcup_{y\in x} TC(\{y\})\subseteq OD. Therefore, x\in HOD.

Lemma 2: For every ZF-formula \phi(x,\vec{y}) we have \forall_{\vec{\gamma}\in On}(\{x \mid \phi(x,\vec{\gamma})\}\in V\rightarrow\{x \mid \phi(x,\vec{\gamma})\}\in OD.
Proof: Let \{x \mid \phi(x,\vec{\gamma)}\in V, \alpha such that \{x \mid \phi(x,\vec{\gamma)}\in V_\alpha, and \vec{\gamma}<\alpha. By Theorem 1 there is a \beta\ge\alpha such that V_\beta reflects \phi, therefore \{x \mid \phi(x,\vec{\gamma})\} = \{x\in V_\beta \mid \phi^{V_\beta}(x,\vec{\gamma})\} = \{x\in V_\beta \mid V_\beta\models\phi(x,\vec{\gamma})\}\in OD.

Theorem 2: HOD is an inner model.
Proof: Due to Part 3, it is sufficient to show (I1), (I2) and (I3). By Lemma 1, (I1) holds.
Let x\subseteq HOD, and \alpha such that x\subseteq V_\alpha. Then trivially, x\subseteq V_\alpha\cap HOD. But by Lemma 2 we have V_\alpha\cap HOD=\{x \mid x\in V_\alpha\wedge x\in HOD\}\in OD. Trivially, we also have V_\alpha\cap HOD\subseteq HOD, so by Lemma 1, V_\alpha\cap HOD\in HOD. Thus, (I2) holds.
For (I3), let \phi(x,\vec{z}) be a \Sigma_0-formula, choose arbitrary \vec{z},u\in HOD, and let a=\{x\in u \mid \phi(x,\vec{z})\}. Then a\subseteq u\subseteq HOD by (I1), so by Lemma 1, it is sufficient to show that a\in OD. As HOD\subseteq OD we have \vec{z},u\in OD, therefore there are \vec{\lceil\phi\rceil},\lceil\psi\rceil and \vec{\alpha},\gamma\in On and for i=1,\ldots,n there are p_i\in \alpha_i^{<\omega}, q\in\gamma^{<\omega} sucht hat z_i = \{y\in V_{\alpha_i} \mid V_{\alpha_i}\models\phi_i(y,p_i)\} and u=\{y\in V_\gamma \mid V_\gamma\models\psi(y,q)\}. Let \alpha=\max\{\vec{alpha},\gamma\}, and \chi(x,y)=\lceil y=V_x\rceil. By Theorem 1 we get a \beta>\alpha such that V_\beta reflects \chi. Therefore

a=\{x\in V_\beta  \mid  V_\beta\models \exists_{\vec{w}}\exists_w(\chi(\gamma, w)\wedge
(\bigwedge\limits_{i=1}^n\chi(\alpha_i,w_i))\wedge
\exists_{\vec{v}}\forall_y (\psi^w(x,q)\wedge\phi(x,\wedge{v})\wedge\bigwedge\limits_{i=1}^n (y\in v_i\leftrightarrow\phi_i^{w_i}(y,p_i))))\}

which is in OD. Therefore, a\in HOD.

Lemma 3: There exists a surjective functor F:On\rightarrow OD.
Proof: In Part 2 we proved that there is a bijection G:On\rightarrow\omega\times On^{<\omega}, so it is sufficient to give a surjective functor F:\omega\times On^{<\omega}\rightarrow OD. Now let \left<\phi_n \mid n<\omega\right> be an enumeration of all ZF-formulae. Define

F(n,p) = \left\{\begin{array}{rl} \{x\in V_{p(m)} \mid V_{p(m)}\models\phi_n(x,p \mid _m)\} & \mbox{ for } \operatorname{dom}(p)=m+1 \\ \emptyset & \mbox{ otherwise}\end{array}\right.

Then F is surjective.

Lemma 4: There exists a surjective functor H:On\rightarrow HOD
Proof: Since HOD\subseteq OD, use H=F \mid _{HOD}.

Theorem: (AC) holds in HOD
Proof: By Lemma 4, we have our surjective H:On\rightarrow HOD. For \alpha\in On, define g_\alpha= H \mid _\alpha. By Lemma 2 we know that for every \alpha, g_\alpha\in OD, and since g_\alpha\subseteq HOD, we have g_\alpha\in HOD. Let u\in HOD be arbitrary but fixed. Then there exists an \alpha such that u\subseteq \operatorname{ran}(g_\alpha). Now define a well-ordering r\subseteq u\times u by

\{\left<x;y\right> \mid \min\{\gamma \mid x=g_\alpha(\gamma)\}<\min\{\gamma \mid y=g_\alpha(\gamma)\}

r is a well-ordering, and r\in HOD. Thus, in HOD, the well-ordering theorem holds, therefore, AC holds in HOD.

This is part 3 of my posts about the irrefutability of AC in ZF.

We define a hierarchy of sets, the V-hierarchy.
  • V_0:=\emptyset
  • V_{\kappa+1}:=\mathbb{P}(V_{\kappa})
  • V_\lambda := \bigcup_{\gamma<\lambda} V_\gamma for limes-ordinals \lambda
Furthermore, we have V_\alpha\subseteq V_\beta for \alpha\le\beta, which can be shown by a trivial inductive argument.

Theorem: We have V=\bigcup_{\gamma\in On} V_\gamma.
Proof: Assume there was an x\in V which is not in V_\alpha for some \alpha. Then it cannot be subset of any V_\alpha, since otherwise it would be element of V_{\alpha+1}. Therefore it must contain an element x' which is not in V_\alpha for every \alpha as well. That is, every set not being in any of the V_\alpha must contain a set not being in any of the V_\alpha, hence, we would get an infinite \ni-chain, which contradicts (FUN). □

Definition: The relativization \phi^W of a formula \phi regarding a class W is defined inductively over the structure on \phi by:
  • \bot^W := \bot
  • (x=y)^W := (x=y)
  • (x\in y)^W := x\in y
  • (\phi\wedge\psi)^W:=\phi^W\wedge\psi^W
  • (\forall_x \phi)^W := \forall_x (x\in W \rightarrow \phi^W)
Intuitively, \phi^W means that \phi holds in W. We will later write W\models\phi to denote that \phi^W holds, defining yet another additional meaning for \models.

Definition: A class W is called inner model of ZF, if it satisfies
  • for all x\in W we have x\subseteq W
  • On\subseteq W
  • for all axioms \phi of ZF we have ZF\models\phi^W
Remark: V is an inner model of ZF, since \phi=\phi^V.

Inner models provide a way of getting new models from models of ZF:

Remark:
Let W be an inner model of ZF, and {\mathcal M} be a model of ZF with domain M and the \in-relation interpreted by a relation \epsilon\subseteq M^2. Let N=\{a\in M|{\mathcal M}\models a\in W\} and \mathcal N be an interpretation with domain N and \in-relation \epsilon\cap N^2. Then {\mathcal N}\models ZF.

Therefore, it is sufficient to give an inner model of ZF satisfying AC to show that AC is not refutable: From a model of ZF, we would get a model of ZF satisfying AC. That is what we are going to do. Before we can do this, we have to do some work to give an alternative classification of inner models.

Definition: A formula is called \Sigma_0-formula, if every quantifier is bounded, that is, every universal quantifier is of the form \forall_x (x\in y\rightarrow \phi), and equivalently, every existential quantifier is of the form \exists_x (x\in y \wedge \phi). (Recall that our language does not contain existential quantifiers directly, but defines them by universal quantifiers. However, our definition is equivalent to what we would get directly.)

Notice that x=y is a \Sigma_0-formula.

We write \forall_{x\in y} \phi for \forall_x (x\in y\rightarrow \phi), and \exists_{x\in y} \phi for \exists_x (x\in y\wedge\phi). Furthermore, if we want to denote a list of variables, then we use the vector notation \vec{x}, and we will use \forall_{\vec{x}} to denote the universal quantification over all of these variables.

Theorem I: Let W be a class. Then the following propositions are equivalent:
  • a. W is an inner model
  • b. W satisfies:
    • (I1) Every element of W is a subset of W
    • (I2) For every x\subseteq W there is an element y\in W such that x\subseteq y
    • (I3) For every \Sigma_0-formula \phi(x,\vec{z}) we have \forall_{\vec{z}\in W}\forall_{u\in W} \{x\in u|\phi(x,\vec{z})\}\in W
To proof Theorem I, some additional work needs to be done. Firstly, we will replace the replacement scheme by two other schemes:

\Sigma_0-comprehension: For every \Sigma_0-formula \phi(x,\vec{z}) we have \forall_{\vec{z}}\forall_u\{x\in u|\phi(x,\vec{z})\}\in V.

Limitation: For every ZF-formula \phi(x,y,\vec{z}) we have \forall_{\vec{z}}\forall_u\exists_v\forall_{x\in u} ((\exists_y \phi(x,y,\vec{z}))\rightarrow \exists_{y\in v}\phi(x,y,\vec{z}))

Theorem:
These schemes are derivable from ZF.
Proof: \Sigma_0-comprehension follows by general comprehension, which follows by (RPL). For limitation, let \phi, \vec{z} and u be given, and a=\{x\in u|\exists_y\phi(x,y,\vec{z})\}. Define a function g:a\rightarrow On, g(x)=\min\{\gamma|\exists_{y\in V_\gamma}\phi(x,y,\vec{z})\}; this is well-defined, since such a \gamma always exists according to the above theorem. Then the range of g is a set of ordinals, and therefore has a supremum \alpha. Now set v=V_\alpha. Then v contains an y for every x, such that \phi(x,y,\vec{z}).


We now show the converse, these schemes imply replacement.

Lemma: For every ZF-formula \phi(\vec{x}), \forall_{u_1}\ldots\forall_{u_n}\{\left<\vec{x}\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\}\in V can be shown by \Sigma_0-comprehension and limitation, not using full comprehension.
Proof: Recall that for pairs \left<x,y\right>:=\{\{x,y\},\{x\}\}\}. We show that from u,v\in V follows u\cap v,u\backslash v,u\times v\in V: u\cap v=\{x\in u\cup v| x\in u\wedge x\in v\}, u\backslash v = \{x\in u|x\not\in v\} are sets by \Sigma_0-comprehension. By limitation, there exists a set b such that for all x\in u we have an y\in b with \forall_{z\in v} \left<x,z\right>\in y. That is, b contains a superset of \{x\}\times v for every x\in u, that is \bigcup_{r\in b}\{r\} is a superset of u\times v, thus, u\times v=\{a \in \bigcup_{r\in b}\{r\}| \exists_{x\in u}\exists_{y\in v} a=\{\{x,y\},\{x\}\}\} is a set due to \Sigma_0-comprehension.

We now use structural induction on \phi(\vec{x}).

If \phi(\vec{x})=\bot, then \emptyset is the desired set.

If \phi(\vec{x}) is of the form x_i\in x_j, let u_1,\ldots,u_n be given and let W=\{\left<\vec{x}\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\}=\{z\in u_1\times\ldots\times u_n|\exists_{x_1\in u_1}\ldots\exists_{x_n\in u_n} (z=\left<x_1,\ldots,x_n\right>\wedge x_i\in x_j)\}, and \exists_{x_1\in u_1}\ldots\exists_{x_n\in u_n} (z=\left<x_1,\ldots,x_n\right>\wedge x_i\in x_j) is a \Sigma_0-formula, which means that W is a set by \Sigma_0-comprehension.

If \phi(\vec{x}) is of the form \phi_1(\vec{x})\wedge\phi_2(\vec{x}), then \{\left<x_1,\ldots,x_n\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\}=\{\left<x_1,\ldots,x_n\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\}\cap\{\left<x_1,\ldots,x_n\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\} and these are both sets by induction.

For \phi(\vec{x})=\phi_1(\vec{x})\rightarrow\phi_2(\vec{x}) notice that \phi(\vec{x})\leftrightarrow \lnot (\phi_1(\vec{x})\wedge\lnot\phi_2(\vec{x})), therefore it is sufficient to show the induction step for negations. Thus, let \phi(\vec{x})=\lnot\phi_3(\vec{x}). Then \{\left<\vec{x}\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\}=u_1\times\ldots\times u_n \backslash \{\left<\vec{x}\right>\in u_1\times\ldots\times u_n|\phi_3(\vec{x})\} which is a set by induction.

For \phi(\vec{x}) = \forall_y \phi_1(\vec{x},y), by limitation we have a set v\in V with \forall_{x_1\in u_1}\ldots\forall_{x_n\in u_n}((\exists_y \lnot\phi_1(\vec{x},y))\rightarrow\exists_{y\in v}\lnot\phi_1(\vec{x},y)). By induction hypothesis, q=\{\left<\vec{x},y\right>\in u_1\times\ldots\times u_n\times v|\phi_1(\vec{x},y)\}\in V, and therefore a=u_1\times\ldots\times u_n\times v-q=\{\left<\vec{x},y\right>\in u_1\times\ldots\times u_n\times v|\lnot\phi_1(\vec{x},y)\}\in V. Therefore, by \Sigma_0-comprehension, b=\{\left<\vec{x}\right>\in u_1\times\ldots\times u_n|\exists_x\lnot\phi_1(\vec{x})\}=\{z\in u_1\times\ldots\times u_n|\exists_{y\in v}\left<z,y\right>\in a\}\in V, therefore, \{\left<\vec{x}\right>\in u_1\times\ldots\times u_n|\phi(\vec{x})\}=u_1\times\ldots\times u_n-b\in V.

Lemma: For \Sigma_0-formulae \phi(\vec{x}) and classes W such that x\in W\Rightarrow x\subseteq W follows \forall_{\vec{x}\in W} (\phi(\vec{x})\leftrightarrow\phi^W(\vec{x})).
Proof: By structural induction. For everything except quantors, the relativization can be looped through. For quantors, we may assume that we always have an existential quantifier, since \forall_x\phi\leftrightarrow\lnot\exists_x\lnot\phi. So let \phi(\vec{x})=\exists_{z\in y}\psi(z,\vec{x},y). We may assume that x\neq y, since otherwise the formula would be equivalent to \bot. Let \vec{x},y\in W, then by induction hypothesis \psi^W(z,\vec{x},y)\rightarrow\psi(z,\vec{x},y), and therefore \phi^W(\vec{x},y)\rightarrow\phi(\vec{x},y). For the other direction, assume \phi(\vec{x},y), that is, there is some z\in y such that \psi(z,\vec{x},y). Then y\in W implies y\subseteq W which implies z\in W, so by induction hypothesis, \psi^W(z,\vec{x},y), and since z\in W we have \exists_{z\in y\cap W}\psi^W(z,\vec{x},y), therefore \phi^W(\vec{x}).

Corollary: For \Sigma_0-formulae \psi(\vec{x},\vec{y}) and classes W in which every element is a subset we have \forall_{\vec{y}\in W}((\forall_{\vec{x}}\psi(\vec{x},\vec{y}))\rightarrow\forall_{\vec{x}\in W}\psi^W(\vec{x},\vec{y}) and \forall_{\vec{y}\in 
W}((\exists_{\vec{x}}\psi(\vec{x},\vec{y}))\leftarrow\exists_{\vec{x}\in
 W}\psi^W(\vec{x},\vec{y})

Theorem: From \Sigma_0-comprehension and limitation follows replacement.
Proof: Let F be a functor, and u\in v. By limitation we get a v\in V with \forall_{x\in u} ((\exists_y F(x)=y)\rightarrow \exists_{y\in v} F(x)=y). Then \{F(x)|x\in u\}=\{y\in v|\exists_{x\in u} F(x)=y\}\in V by the above Lemma.


We finally arrive at...

Proof of Theorem I: "a=>b": (I1) by definition. For (I2), consider the V_\alpha-hierarchy inside W, that is, define W_\alpha := V_\alpha^W. For x\in V,x\subseteq W we can find an \alpha such t hat x\subseteq W_\alpha, and W_\alpha\in W. For (I3), let \phi(x,\vec{z}) be a \Sigma_0-formula and let \vec{z}\in W. Because of (I1) we have \phi(x,\vec{z})\leftrightarrow\phi^W(x,\vec{z}) for all x\in W according to the above lemma. Therefore, for every u\in W, \{x\in u|\phi(x,\vec{z})\}=\{x\in u\cap W|\phi^W(x,\vec{z})\}, which is in W by comprehension.
"b=>a": Trivially, On\cap W is transitive, and as every transitive set of ordinals is an ordinal, either On\cap W\in On or On\cap W = On. Assume On\cap W\in On, then by (I2) we have an y\in W such that On\cap W\subseteq y. But On\cap W = \{x\in y|x\in On\} which is in W by (I3), so On\cap W\in On\cap W. Contradiction. Therefore, On\subseteq W. We show the ZF-axioms. (EXT) and (FUN) hold because of the above Corollary, (NUL) and (INF) hold since \omega,\emptyset\in On\subseteq W.  For (UNI), let x\in W, then x\subseteq W and therefore every y\in x is subset of W, therefore \bigcup_{y\in x}y\in W and according to (I2) we find a superset u\in W. But then, \bigcup_{y\in x}y = \{y\in u|\exists_{z\in x} y\in z\}\in W by (I3). Similar for (PAR). For (POW), let x\in W. Then by (I2), since \mathbb{P}(x)\cap W\subseteq W, there exists an u\in W such that \mathbb{P}(x)\cap W\subseteq u. Then by (I3), \mathbb{P}(x)\cap W=\{z\in u|z\subseteq x\}\in W.
We have proven that replacement can be replaced by \Sigma_0-comprehension and limitation, therefore, we show \Sigma_0-comprehension and limitation in W. For \Sigma_0-comprehension, let \phi(x,\vec{z}) be a \Sigma_0-formula. We have to show that \forall_{\vec{z}\in W}\forall_{u\in W}\{x\in u|\phi^W(x,\vec{z})\}\in W. But by the above Lemma, this is the same set as \{x\in u|\phi^W(x,\vec{z})\}, which is in W according to (I3). For limitation, consider again a \Sigma_0-formula \phi(x,y,\vec{z}). Then we have to show \forall_{\vec{z}\in W}\forall_{\vec{u}\in W}\exists_{v\in W}\forall_{x\in U}((\exists_{y\in W}\phi^W(x,y,\vec{z}))\rightarrow\exists_{y\in v}\phi^W(x,y,\vec{z})). Let \vec{z},u\in W, then there exists a v'\in V such that \forall_{x\in u}((\exists_{y\in W}\phi^W(x,y,\vec{z}))\rightarrow\exists_{y\in v'} (y\in W\wedge \phi^W(x,y,\vec{z}))). By (I2) we get a v\in W such that v'\cap W\subseteq v, which trivially satisfies the above formula.