 # Set Theory

Set Theory

Set Theory is a branch of mathematics that investigates sets and their properties. The basic concepts of set theory are fairly easy to understand and appear to be self-evident. However, despite its apparent simplicity, set theory turns out to be a very sophisticated subject. In particular, mathematicians have shown that virtually all mathematical concepts and results can be formalized within the theory of sets. This is considered to be one of the greatest achievements of modern mathematics. Given this achievement, one can claim that set theory provides a foundation for mathematics.

The foundational role of set theory and its mathematical development have raised many philosophical questions that have been debated since its inception in the late nineteenth century. For example, here are three: Does infinity exist, and if so, are there different kinds of infinity? Is there a mathematical universe? Are all mathematical problems solvable?

Before pursuing the philosophical issues concerning set theory, one should be familiar with a standard mathematical development of set theory. This article presents such a development. A companion article addresses the philosophical issues that are raised by set theory and its development.

In the late nineteenth century, the mathematician Georg Cantor (1845–1918) created and developed a mathematical theory of sets. This theory emerged from his proof of an important theorem in real analysis. In this proof, Cantor introduced a process for forming sets of real numbers that involved an infinite iteration of the limit operation. Cantor’s novel proof led him to a deeper investigation of sets of real numbers and to his theory of abstract sets. Cantor’s creation now pervades all of mathematics and offers a versatile tool for exploring concepts that were once considered to be ineffable, such as infinity and infinite sets.

Sections 1 and 2 below describe the “naïve” principles of set theory that were used and developed by Cantor. Then, Section 3 describes a more sophisticated (axiomatic) approach to set theory that arose from the discovery of Russell’s paradox. After identifying the Zermelo-Frankel axioms of set theory, Section 4 discusses Cantor’s well-ordering principle and examines how Cantor used the well-ordering principle to develop the ordinal and cardinal numbers. Section 5 considers controversies concerning the well-ordering principle and its equivalent, the axiom of choice. This is followed by introducing the cumulative hierarchy of sets, Kurt Gödel’s universe of constructible sets, and Paul Cohen’s method of forcing in Sections 6, 7, and 8, respectively. The latter two topics, explored in Sections 7 and 8, can be used to show that certain questions are unresolvable when assuming the Zermelo-Frankel axioms (with or without the axiom of choice). The next two sections address further developments in set theory that are intended to settle these and other unresolved questions; namely, Section 9 discusses large cardinal axioms, and Section 10 investigates the axiom of determinacy.

On the Origins
Cantor’s Development of Set Theory
The Zermelo-Fraenkel Axioms
The Axioms
Classes
Cantor’s Well-Ordering Principle
Ordinal Numbers
Cardinal Numbers
The Axiom of Choice
On Zermelo’s Proof of the Well-Ordering Principle
The Cumulative Hierarchy
Gödel’s Constructible Universe
Cohen’s Forcing Technique
Large Cardinal Axioms
The Axiom of Determinacy
Concluding Remarks
Primary Sources
Secondary Sources
Internet Sources
1. On the Origins

Let us first discuss a few basic concepts of set theory. A set is a well-defined collection of objects. The items in such a collection are called the elements or members of the set. The symbol “” is used to indicate membership in a set. Thus, if is a set, we write to say that “ is an element of ,” or “ is in ,” or “ is a member of .” We also write to say that is not in . In mathematics, a set is usually a collection of mathematical objects, for example, numbers, functions, or other sets.

Sometimes a set is identified by enclosing a list of its elements by curly brackets; for example, a set of natural numbers can be identified by the notation

More typically, one forms a set by enclosing a particular expression within curly brackets, where the expression identifies the elements of the set. To illustrate this method of identifying a set, we can form a set B of even natural numbers, using the above set , as follows:

which can be read as “the set of such that is even.” Of course,

is even.

It is difficult to identify the genesis of the set concept. Yet, the idea of a finite collection of objects has existed for as long as the concept of counting. Indeed, mathematicians have been investigating finite sets and methods for measuring the size of finite sets since the beginning of mathematics. For example, the above two sets

are finite sets. As every element in is an element in , the set is said to be a subset of , denoted by . Since there are elements in that are not in , we say that is a proper subset of . Moreover, the number of elements in is strictly smaller than the number of elements in . Thus, one can say, “the whole is greater in size than its proper part .”

We view the sets and as existing entities that both contain infinitely many elements. Thus, and are “completed infinities.” Observe that every element in is in , and that is a proper subset of . However, if, as many mathematicians once believed, “infinity cannot be greater than infinity,” then the whole is not greater in size than its proper part . This counterintuitive result was viewed by many early prominent mathematicians as being contradictory, as it appeared to conflict with the well-understood behavior of finite sets. These mathematicians thus concluded that the concept of a “completed infinity” should not be allowed in mathematics.

For this reason, before Cantor, a majority of mathematicians considered infinite collections to be mathematically illicit objects. Cantor was the first mathematician to view infinite sets as being legitimate mathematical objects that can coexist with finite sets. Clearly, the size of a finite set can be measured simply by counting the number of elements in the set. Cantor was the first to investigate the following question:

Can the concept of “size” be extended to infinite sets?

Cantor addressed this question in the affirmative by using the concept of a function to measure and compare the sizes of infinite sets. Functions are widely used in science and mathematics. For sets and , we say that is a function from to , denoted by : , if and only if is a relation (operation) that assigns to each element in , a single element in . There are three important properties that a function might possess:

: is an injection if and only if for each in there is at most one in such that .
: is a surjection if and only if for each in there is at least one in such that .
: is a bijection if and only if for each in there is exactly one in such that .

Observe that : is an injection if and only if distinct elements in are assigned to distinct elements in ; that is, for all and in , if , then . Also note that : is a bijection if and only if : is an injection and a surjection.

Cantor observed that two sets and have the same size if and only if there is a one-to-one correspondence between and , that is, there is a way of evenly matching the elements in with the elements in . In other words, Cantor noted that the sets and have the same size if and only if there is a bijection : . In this case, Cantor said that and have the same cardinality. For an illustration, let be the set of natural numbers and let be the set of even natural numbers. Now let : be defined by . One can verify that : is a bijection and, thus, we obtain the following one-to-one correspondence between the set of natural numbers and the set of even natural numbers:

Hence, each natural number corresponds to the even number , and each even natural number is thereby matched with . The bijection : specifies a one-to-one match-up between the elements in and the elements in . Cantor concluded that the sets N and E have the same cardinality.

Cantor also defined what it means for a set to be smaller, in size, than a set . Specifically, he said that has smaller cardinality (smaller size) than if and only if there is an injection : but there is no bijection : . Cantor then proved that there is no one-to-one correspondence between the set of real numbers and the set of natural numbers. Cantor’s proof showed that the set of real numbers has larger cardinality than the set of natural numbers (Cantor 1874). This stunning result is the basis upon which set theory became a branch of mathematics.

The natural numbers are the whole numbers that are typically used for counting. The real numbers are those numbers that appear on the number line. For example, the natural number , the integer , the fraction , and all of the other rational numbers are real numbers. The irrational numbers, such as and , are also real numbers. Again, let be the set of natural numbers, and let be the set of real numbers. If a set is either finite or has the same cardinality as the set of natural numbers, then Cantor said that it is countable. Since the set of real numbers is larger, in size, than the set of natural numbers , Cantor referred to the set as being uncountable.

After proving that the set of real numbers is uncountable, Cantor was able to prove that there is an increasing sequence of larger and larger infinite sets. In other words, Cantor showed that there are “infinitely many different infinites,” a result with clear philosophical and mathematical significance.

After his introduction of uncountable sets, in 1878, Cantor announced his Continuum Hypothesis (CH), which states that every infinite set of real numbers is either the same size as the set of natural numbers or the same size as the entire set of real numbers. There is no intermediate size. Cantor struggled, without success, for most of his career to resolve the Continuum Hypothesis. The problem persisted and became one of the most important unsolved problems of the twentieth century. After Cantor’s death, most set theorists came to believe that the Continuum Hypothesis is unresolvable.

Cantor’s profound results on the theory of infinite sets were counterintuitive to many of his contemporaries. Moreover, Cantor’s set theory violated the prevailing dogma that the notion of a “completed infinity” should not be allowed in mathematics. Thus, the outcry of opposition persisted. Influential mathematicians continued to argue that Cantor’s work was subversive to the true nature of mathematics. These mathematicians believed that infinite sets were dangerous fictional creations of Cantor’s imagination and that Cantor’s fictions needed to be eradicated from mathematics (Dauben 1979, page 1) (Dunham 1990, pp. 278-280). Nevertheless, Cantor’s theory of sets soon became a crucial tool used in the discovery and establishment of new mathematical results, for example, in measure theory and the theory of functions (Kanamori 2012). Mathematicians slowly began to see the utility of set theory to traditional mathematics. Accordingly, attitudes started to change and Cantor’s ideas began to gain acceptance in the mathematical community (Dauben 1979, pp. 247-248). The significance of Cantor’s mathematical research was eventually recognized. David Hilbert, a prominent twentieth century mathematician, described Cantor’s work as being

the finest product of mathematical genius and one of the supreme achievements of purely intellectual human activity. (Hilbert 1923)

Ultimately, Cantor’s theory of abstract sets would dramatically change the course of mathematics.

2. Cantor’s Development of Set Theory

In his development of set theory, Cantor identified a single fundamental principle, called the Comprehension Principle, under which one can form a set. Cantor’s principle states that, given any specific property concerning a variable , the collection is a set, where is the set of all objects that satisfy the property . For example, let be the property that “ is an odd natural number.” The Comprehension Principle implies that

is a set. Employing the Comprehension Principle, one can form the intersection of two sets and using the property “ and “; thus, the intersection of and is the set

and .

One can also form the set

or

which is called the union of and . Recall that one writes to mean that is a subset of , that is, every element of is also an element of . Using the Comprehension Principle, one can form the power set of , which is the set whose elements are all of the subsets of , that is,

Thus, if is a set and , then . So, if and , then

,
, and

where denotes the empty set, that is, the set that contains no elements. The Comprehension Principle was an essential tool that allowed Cantor to form many important sets. Cantor’s approach to set theory is often referred to as naïve set theory.

Cantor’s set theory soon became a very powerful tool in mathematics. In the early 1900s, the mathematicians Émile Borel, René-Loius Baire, and Henri Lebesgue used Cantor’s set theoretic concepts to develop modern measure theory and function theory (Kanamori 2012). This work clearly demonstrated the great mathematical utility of set theory.

The philosopher and mathematician Bertrand Russell was interested in Cantor’s work and, in particular, Cantor’s proof of the following theorem, which implies that the cardinality of the power set of a set is larger than the cardinality of the set. First, recall that a function : is a surjection (or is onto ) if for all , there is an such that .

Cantor’s Theorem. Let be a set. Then there is no surjection : .

Proof. Suppose, for the sake of obtaining a contradiction, that there exists a surjection : . Observe that, for all , . By the Comprehension Principle, let be the set

and .

Clearly, . Thus, . As is onto , there is an such that . There are two cases to consider: either or . If , then the definition of implies that . Since , we have that , which is a contradiction. On the other hand, if , then the definition of implies that . Since , we see that , a contradiction. Thus, there is no surjection : . This completes the proof.

In 1901, after reading Cantor’s proof of the above theorem, that was published in 1891, Bertrand Russell discovered a devastating contradiction that follows from the Comprehension Principle. This contradiction is known as Russell’s Paradox. Consider the property “”, where represents an arbitrary set. By the Comprehension Principle, we conclude that

is a set. The set consists of all the sets that satisfy . Clearly, either or . Suppose . Then, the definition of the set implies that must satisfy the property , which contradicts our supposition. Suppose . Since satisfies , we infer, from the definition of , that , which is also a contradiction.

There were similar paradoxes discovered by others, including Cantor (Dauben 1979), but Russell’s paradox is the easiest to understand. These paradoxes appeared to threaten Cantor’s fundamental principle that he used to develop set theory. Nevertheless, Cantor did not believe that these paradoxes actually refuted his development of set theory. He knew that the construction of certain collections can lead to a contradiction. Cantor referred to these collections as “inconsistent multiplicities.” Today, such collections are called proper classes, and the paradoxes can be used to prove that they are not sets.

3. The Zermelo-Fraenkel Axioms

Over time, it became clear that, to resolve the paradoxes in Cantor’s set theory, the Comprehension Principle needed to be modified. Thus, the following question needed to be addressed:

How can one correctly construct a set?

Ernst Zermelo (1871–1953) observed that to eliminate the paradoxes, the Comprehension Principle could be restricted as follows: Given any set and any property , one can form the set , that is, the collection of all elements that satisfy , is a set. Zermelo’s approach differs from Cantor’s method of forming a set. Cantor declared that for every property one can form a set of all the objects that satisfy the property. Zermelo adopted a different approach: To form a set, one must use a property together with a set.

Zermelo also realized that in order to more fully develop Cantor’s set theory, one would need additional methods for forming sets. Moreover, these additional methods would need to avoid the paradoxes. In 1908, Zermelo published an axiomatic system for set theory that, to the best of our knowledge, avoids the difficulties faced by Cantor’s development of set theory. In 1930, after receiving some proposed revisions from Abraham Fraenkel, Zermelo presented his final axiomatization of set theory, now known as the Zermelo-Fraenkel axioms and denoted by ZF. These axioms have become the accepted formulation of Cantor’s ideas about the nature of sets.

a. The Axioms

As noted by Zermelo, to avoid paradoxes, the Comprehension Principle can be replaced with the principle: Given a set and a property with a variable , the collection is a set. However, this raises a new question: What is a property? The most favored way to address this question is to express the axioms of set theory in the formal language of first-order logic, and then declare that its formulas designate properties. This language involves variables and the logical connectives (and), (or), (not), → (if … then …), and ↔ (if and only if), together with the quantifier symbols (for all) and (there exists). In addition, this language uses the relation symbols and (as well as and ). In this language, the variables and quantifiers range over sets and only sets. A formula constructed in this formal language is referred to as a formula in the language of set theory. Such formulas are used to give meaning to the notion of “property.”

We now illustrate the expressive power of this set theoretic language. The formula asserts that “the set is nonempty,” and states that “ has no elements.” Moreover states that “it is not the case that there is a set that contains all sets as elements.” In addition, one can translate English statements, which concern sets, into the language of set theory. For example, the English sentence “the set contains at least two elements” can be translated into the language of set theory by .

There is another quantifier, called the uniqueness quantifier, that is sometimes used. This quantifier is written as and it means that “there exists a unique satisfying .” This is in contrast with , which simply states that “at least one satisfies .” The uniqueness quantifier is used as a convenience, as the assertion can be expressed in terms of the other quantifiers and ; namely, it is equivalent to the formula

The above formula is equivalent to because it asserts that “there is an such that holds, and any sets and that satisfy and must be the same set.”

The Zermelo-Fraenkel axioms are listed below. Each axiom is first stated in English and then written in logical form. After each logical form, there is a discussion of the axiom and some of its consequences. When reading these axioms, keep in mind that, in Zermelo-Fraenkel set theory, everything is a set, including the elements of a set. Also, the notation means that are free variables in the formula and that is allowed to contain parameters (free variables other than ) that represent arbitrary sets.

a)

Extensionality Axiom. Two sets are equal if and only if they have the same elements.

The extensionality axiom is essentially a “definition” that states that two sets are equal if and only if they have exactly the same elements.

b)

Empty Set Axiom. There is a set with no elements.

The empty set axiom states that there is a set which has no elements. Since the extensionality axiom implies that this set is unique, we let denote the empty set.

c)

Subset Axiom. Let be a formula. For every set , there is a set that consists of all the elements such that holds.

(The variable is assumed not to appear in the formula .) The subset axiom, also known as the axiom of separation, asserts that any definable sub-collection of a set is itself a set, that is, for any formula and any set , the collection is a set. Clearly, the subset axiom is a limited form of the Comprehension Principle. Yet, it does not lead to the contradictions that result from the Comprehension Principle. The subset axiom is, in fact, an axiom schema since it yields infinitely many axioms-one for each formula .

d)

Pairing Axiom. For every and , there is a set that consists of just and .

The pairing axiom states that, for any two sets and , the set exists. Thus, by the extensionality axiom, the set exists.

e)

Union Axiom. For every set , there exists a set that consists of all the elements that belong to at least one set in .

The union axiom states that, for any set , there is a set whose elements are precisely those elements that belong to an element of , that is, if and only if for some . The extensionality axiom implies that the set is unique; it is often denoted by . For example, consider the set . Then

belongs to a member of or .

For another example, let . Then .

f)

Power Set Axiom. For every set , there exists a set that consists of all the sets that are subsets of .

The power set axiom states that, for any set , there is a set, which we denote by , such that for any set , if and only if .

g)

Infinity Axiom. There is a set that contains the empty set as an element and whenever , then .

The infinity axiom ensures the existence of at least one infinite set. For any set , the successor of is defined to be the set . Thus, the axiom of infinity asserts that there is a set such that and if , then . Note that , and that . It follows that the set contains each of the sets

One can show that any two of the sets in the above list (separated by a semi-colon) are distinct. Hence, the set contains an infinite number of elements; in other words, is an infinite set. So, the infinity axiom simply states that infinite sets exist and are legitimate mathematical objects. The infinity axiom is a key tool that is used to develop the set of natural numbers and to prove that is well-ordered, that is, every nonempty set of natural numbers has a least element.

h)

Replacement Axiom. Let be a formula. For every set , if for each there is a unique such that , then there is a set that consists of all of the elements such that for some . (Below, ! is the uniqueness quantifier.)

(The variable is assumed not to appear in the formula .) The replacement axiom states that for every set , if for each there is a unique such that , then the collection : is a set; that is, a “functional image of a set, is a set.” The replacement axiom is a special form of Cantor’s Comprehension Principle that plays a critical role in modern set theory. However, the replacement axiom does not lead to the contradictions that follow from the Comprehension Principle. Like the subset axiom, the replacement axiom is an axiom schema. Accordingly, there are infinitely many Zermelo-Fraenkel axioms.

i)

Regularity Axiom. Each nonempty set contains an element that is disjoint from .

The regularity axiom, also known as the axiom of foundation, states that, for any nonempty set , there is a set such that . The regularity axiom rules out the possibility of a set belonging to itself. In standard mathematics, there are no sets that are members of themselves. For example, the set of natural numbers is not a natural number. The regularity axiom eliminates collections that are not relevant for standard mathematics. The regularity and pairing axioms imply that if , then . To see this, suppose that . Then it follows, from regularity, that . So .

The Zermelo-Fraenkel axioms are now the most widely accepted answer to the question: How can one correctly construct a set? Of course, these axioms are more restrictive than Cantor’s Comprehension Principle; however, no one, in over 100 years, has been able to derive a contradiction from these axioms. Moreover, all of the classic results (excluding the paradoxes) that were derived using Cantor’s naïve set theory can be derived from the Zermelo-Fraenkel axioms.

It is a remarkable fact that essentially all mathematical objects can be defined as sets within Zermelo-Fraenkel set theory. For example, functions, relations, the natural numbers, and the real numbers can be defined within Zermelo-Fraenkel set theory. Hence, effectively all theorems of mathematics can be considered as statements about sets and proven from the Zermelo-Fraenkel axioms.

b. Classes

The argument used in Russell’s Paradox can be applied to prove, in ZF, that there is no set that contains all sets (as elements). As every set is equal to itself, the collection : contains every set, but this collection is not a set. Thus, given a formula , one cannot necessarily conclude that the collection is a set. However, in set theory, it is convenient to be able to discuss such collections. They cannot be called sets. Instead, a collection of the form : is called a class. The collection is a class that is not a set; for this reason, it is called a proper class.

When can one prove that a class is a set? Let us say that a class is bounded if and only if there is a set such that for all , if , then . Using the subset axiom, one can prove that a bounded class is a set. It follows that the class is not bounded.

In the Zermelo-Fraenkel axioms, there is no explicit mention of classes. However, there are alternative axiomatizations of set theory that extend ZF by including classes as objects in the language, that is, these axiom systems give classes a formal state of existence. The most common such axiomatic treatment of classes is denoted by NBG (von Neumann–Bernays–Gödel). The NBG system uses a formal language that has two different types of variables: capital letters denote classes and lowercase letters denote sets. In addition, classes can contain only sets as elements. So, a class that is not a set cannot belong to a class. Thus, a class is a set if and only if . In the NBG system, sets satisfy all of the ZF axioms, and the intersection of a class with a set is a set, that is, is a set. The NBG system also has the class comprehension axiom:

where the formula can contain set parameters and/or class parameters (with other restrictions). Thus, the class comprehension axiom asserts that is a class.

The NBG system is a conservative extension of ZF; that is, a sentence with only lowercase (set) variables is provable in NBG if and only if it is provable in ZF. The Zermelo-Fraenkel system has a clear advantage over NBG, namely, the simplicity of working with only one type of object (sets) rather than two types of objects (sets and classes). The Zermelo-Fraenkel axiomatic system is the standard system of axioms for modern set theory.

4. Cantor’s Well-Ordering Principle

As proposed by Cantor, two sets and have the same cardinality if and only if there is a bijection : . When is a finite set, there is a unique natural number, denoted by ||, that identifies the number of elements in . In this case, we say that || is the cardinality of . For example, if , then || . Clearly, the cardinality of a finite set identifies the number of elements that are in the set. Moreover, if and are both finite sets, then one can prove that

|| = || if and only if there exists a bijection : .

()

With this understanding, Cantor asked the following question:

Are there values that can represent the size of infinite sets and satisfy ()?

In other words, given two infinite sets and , can one assign values || and || such that

|| = || if and only if there exists a bijection : ?

Cantor answered this question, in the affirmative, by developing the transfinite ordinal numbers, which are “infinite numbers” in the sense that they are larger than all of the natural numbers, and are well-ordered just like the natural numbers. Cantor believed that each infinite set can be assigned a specific ordinal number and that this ordinal number would measure the size of the set. Cantor realized that, in order to successfully apply his theory of ordinal numbers, he needed an additional principle. In 1883, he proposed the following principle.

Well-Ordering Principle: It is always possible to bring any well-defined set into the form of a well-ordered set.

A relation on a set is a well-ordering of if and only if it is a total ordering in which every non-empty subset of has a least element, where it is assumed that the relation does not apply to any elements that are not in . If a set can be well-ordered, then one can generalize the concepts of induction and recursion, similar to mathematical induction, on the elements of the set. Given any infinite set, Cantor used the well-ordering principle to identify an ordinal number that measures the size of the set. Such an ordinal is called a cardinal number.

a. Ordinal Numbers

The natural numbers are often used for two purposes: to indicate the position of an element in a sequence and to identify the size of a finite set. In other words, a natural number can be used to identify a position (first, second, third, …) and it can be used to identify a size (one, two, three, …). Cantor extended the natural numbers by introducing the concepts of transfinite position and transfinite size. Suppose that we want to count the number of real numbers. As noted in Section 1, Cantor proved that the set of real numbers is uncountable. Thus, if we attempted to assign each real number to exactly one of the natural numbers then we would not have enough natural numbers to complete this task. However, suppose that we add some new numbers, called transfinite ordinals, to our stock of numbers. Clearly, we need an ordinal that will identify the first position that occurs after all of the natural numbers. Cantor denoted this ordinal by the Greek letter . That is, Cantor proposed the following “position” sequence

.

(1)

Observe the following:

By starting with and repeatedly adding , we obtain all of the natural numbers.
Every natural number greater than has an immediate predecessor; for example, has as its immediate predecessor.

By contrast, the ordinal number cannot be obtained by repeatedly adding to and it does not have an immediate predecessor. For these reasons, we say that is a limit ordinal.

We can continue the sequence (1) by repeatedly adding to . By doing so, we obtain the following position sequence:

(2)

The process for constructing (1) and (2) can be repeated endlessly. In this way, we obtain the ordered sequence of all of the ordinals:

(3)

where is a limit ordinal which is usually represented by . An ordinal of the form is called a successor ordinal. An ordinal > that is not a successor ordinal is called a limit ordinal. Cantor used the ordinals to measure the “length” of a well-ordered set.

The natural numbers are sometimes called finite ordinals. Every nonempty subset of the natural numbers has a least element. Similarly, every nonempty set of ordinals has a least element with respect to the ordering in (3). The ordinal numbers are a generalized extension of the natural numbers. One can define the operations of addition, multiplication, and exponentiation on the ordinal numbers. These operations satisfy some (but not all) of the arithmetic properties that hold on the natural numbers, for example, addition is associative (Cunningham 2016).

The set of predecessors of an ordinal is the set of all of the ordinals that come before it in the list (3); for example, the set of predecessors of and are the respective sets

, .
(4)

The ordinals and represent different positions in the list (3); but, the sets and in (4) have the same cardinality. Note that the cardinality of is larger than any finite set, that is, for any natural number , the set has cardinality larger than the set . For this reason, we say that is a cardinal number.

For any two ordinals and , we say that < if and only if appears before in the list (3). For each ordinal , let Pred() = < be the set of predecessors of . One can prove, in ZF, that Pred() is a set. In contemporary set theory one usually defines the ordinals so that, for each ordinal , = Pred; that is, each ordinal is defined to be the set of its predecessors. Specifically, a set is said to be an ordinal if and only if is well-ordered by the membership relation and is transitive, that is, every element in is a subset of . Thus, if < , then and . For example, is an ordinal if the integers (the finite ordinals) are defined as follows: , , , , . This approach is due to Von Neumann (Kunen 2009), and such ordinals can be called Von Neumann ordinals. The collection of all ordinals is a proper class (see Cunningham 2016). b. Cardinal Numbers An ordinal number is said to be a cardinal if and only if, for all < , the set Pred() has smaller cardinality than Pred(). It follows that the natural numbers are all cardinals. As noted above, is the first transfinite cardinal, which is often denoted by . The next transfinite cardinal, after , is designated by . This process can be continued to produce the following sequence of finite and transfinite cardinals: (5)  where the transfinite cardinal numbers in (5) are indexed by the ordinal numbers. Thus, the collection of all the cardinal numbers is a proper class. A cardinal is called a successor cardinal if and only if is a successor ordinal; otherwise, it is called a limit cardinal. One can prove, in ZF, that, for every cardinal , there is an ordinal such that (Cunningham 2016). Thus, every cardinal appears on the list (5). One can define the operations of addition, multiplication, and exponentiation on the cardinals (exponentiation requires the well-ordering principle). These particular operations are not the same as the corresponding operations on the ordinal numbers (Cunningham 2016). Cantor used the cardinal numbers to measure the “size” of sets. The well-ordering principle implies that every set A can be assigned a (unique) cardinal number that measures its size. This cardinal number is usually denoted by ||, and is called the cardinality of . Cantor’s Theorem implies that, for any set , || < ||. The operation of cardinal exponentiation allowed Cantor to prove that the cardinality of , the set of real numbers, is equal to , that is, || = . Since is the first cardinal greater than , Cantor was able to express the Continuum Hypothesis in terms of the equation . Moreover, assuming the well-ordering principle, one can conclude that a set is countable if and only if || and that a set is uncountable if and only if ||. Infinite cardinals come in two distinct forms: regular or singular. An infinite cardinal is said to be a regular cardinal if and only if is not the union of a set consisting of less than many smaller cardinals. Thus, if is a regular cardinal, is a set of cardinals smaller than , and || < , then . Assuming the well-ordering principle, it follows that each successor cardinal is a regular cardinal. When a cardinal is not regular, it is called a singular cardinal. One can show that an infinite cardinal is singular if and only if there exists an ordinal < and a function : Pred() → Pred() such that for all < there is an ordinal < such that < . It follows that is a singular cardinal. 5. The Axiom of Choice At the third International Congress of Mathematicians at Heidelberg in 1904, Julius König submitted a proof that the well-ordering principle is false; in particular, he presented an argument showing the set of real numbers cannot be well-ordered. On the next day, Ernst Zermelo identified an error in König’s purported proof. Shortly after the Heidelberg congress, Zermelo (Moore 2012) discovered a proof of the following theorem, which implies that the error found in König’s proof cannot be removed. Well-Ordering Theorem: Every set can be well-ordered  In his clever proof of the well-ordering theorem, Zermelo formulated and applied the following principle, which he was the first to identify. Axiom of Choice (AC). Let be a set of nonempty sets. Then there is a function such that, for each set in , .  The function mentioned in AC is called a choice function for the set . Informally, the axiom of choice asserts that, for any collection of nonempty sets, it is possible to uniformly choose exactly one element from each set in the collection. When is a finite set, one can prove, in ZF, that there exists a choice function. Today, mathematicians use the axiom of choice when the set is infinite and it is not clear how to define or construct a desired choice function. Zermelo applied the axiom of choice to establish the well-ordering theorem. The well-ordering theorem validates both Cantor’s well-ordering principle and that every set can be assigned a cardinal number that measures its size. a. On Zermelo’s Proof of the Well-Ordering Principle Zermelo’s proof of the well-ordering theorem is the first mathematical argument that explicitly invokes the axiom of choice. As a result, the proof can be viewed as an important moment in the development of modern set theory. For this reason, we now present a summary of this proof. Let be a nonempty set and let be the set of all nonempty subsets of ; that is, let : .  Let be a choice function for . Call a set a -set if and only if there is a well-ordering of such that, for each , ∶ ≮ .  Thus, each element is the element that the choice function selects from the set of all elements in that do not (strictly) precede in the ordering . For example, if , then one can show that is a -set. Thus, -sets exist. Let be a -set with well ordering and let be a -set with well-ordering . In his proof, Zermelo showed that either and continues or and continues , where we say that continues when the order only adds new elements that are greater than all of the elements ordered by . Zermelo also showed that the union of all of the -sets is a -set and that this union equals . Therefore, can be well-ordered. Essentially, the axiom of choice states that one can make infinitely many arbitrary choices. As noted above, Cantor’s acceptance of infinite sets led to a dispute among some of Cantor’s contemporaries. Similarly, Zermelo’s axiom of choice incited further controversy concerning the infinite. The main objection to the axiom of choice was the obvious one: How can the existence of a choice function be justified when such a function cannot be defined or explicitly constructed? Surprisingly, many of the axiom’s severest critics had unwittingly applied the axiom in their own work. In the decades following its introduction, the axiom of choice gained acceptance among most mathematicians; in part, this was because the axiom of choice is a very useful principle whose deductive strength is required to prove many important mathematical theorems (Moore 2012). Moreover, the axiom of choice is equivalent to a number of seemingly unrelated principles in mathematics. For example, in ZF, the axiom of choice is equivalent to Zorn’s lemma, the well-ordering theorem, and the comparability theorem (see Cunningham 2016). The Zermelo-Fraenkel system of axioms is denoted by ZF and the axiom of choice is abbreviated by AC. The axiom of choice is not one of the axioms in ZF. The result of adding the axiom of choice to the system ZF is denoted by ZFC. There were many unsuccessful attempts to prove the axiom of choice assuming only the axioms in ZF. As a result, mathematicians began to doubt the possibility of proving the axiom of choice from the axioms in ZF and, eventually, it was shown that such a proof does not exist. The combined work of Kurt Gödel, in 1940, and Paul Cohen, in 1963, confirmed that the axiom of choice is independent of the Zermelo-Fraenkel axioms, that is, AC cannot be proven or refuted using just the axioms in ZF. Nevertheless, the axiom of choice is a powerful tool in mathematics and there are many significant theorems that cannot be established without it. Consequently, mathematicians typically assume the axiom of choice and often cite it when they use it in a proof. b. Banach-Tarski Paradox Set theory frequently deals with infinite sets. Moreover, as we have seen, there are times when infinite sets have properties that are unlike those of finite sets. Such properties of infinite sets can appear to be counter-intuitive or paradoxical, because they conflict with the behavior of finite sets or with our limited intuition. Cantor proved a theorem that illustrates this fact. Let denote the unit interval , that is, the set of all real numbers such that . Let denote the unit square in the plane, that is, the set of all ordered pairs such that such that and . The sets and appear in the following figure: Cantor initially believed that the set of points in the two-dimensional square must have cardinality much larger than the set of points in the one-dimensional interval . Then he discovered a proof showing that his initial intuition was wrong. Cantor’s theorem below, which can be proven without the axiom of choice, shows the sets and have the same cardinality. Theorem (Cantor). There exists a bijection : .  One can use the bijection : to proclaim that one can, theoretically, disassemble all of the points in the interval and then reassemble these points to obtain the unit square . This, of course, is counter-intuitive, as we know that one cannot cut-up a 1-foot piece of thread and then put the pieces together to obtain a square-foot piece of fabric. Thus, there are infinite abstract objects that do not behave in the same way as finite concrete objects. We now present a theorem due to Stefan Banach and Alfred Tarski (1924). The proof of this theorem uses the axiom of choice, in an essential manner, to prove another counter-intuitive result. Some have claimed that this theorem thus refutes the axiom of choice. First, we identify some terminology. In three-dimensional space, a unit ball is a set of points of distance less than or equal to from a fixed central point. Theorem (Banach, Tarski). A unit ball in three-dimensional space can be split into five pieces that can be rigidly moved, rotated, and put back together to form two unit balls.  The Banach–Tarski Theorem is often referred to as a paradox because it is counter-intuitive; for example, the theorem implies that, theoretically, one can split a solid glass ball into five pieces and then use the pieces to create two new glass balls of the same size as the original. However, in the proof of the theorem, the five pieces that are formed are not solids that have a measurable volume; they are five complex infinite sets of points. We repeat: there are infinite abstract objects that do not behave in the same way as finite concrete objects. The conclusion of the Banach–Tarski Theorem does not refute the axiom of choice, and Cantor’s above theorem does not render the axioms of set theory false. Ever since the ancient Greeks, there have been results in mathematics that were once viewed as being counter-intuitive. Such results eventually become better understood and, as a result, become more intuitive themselves. 6. The Cumulative Hierarchy Zermelo’s 1904 proof of the well-ordering theorem resembles von Neumann’s 1923 proof of the transfinite recursion theorem, a powerful tool in set theory. A formula is said to be functional if and only if ; that is, for all , there is a unique such that . Given a functional formula, , consider the class of ordered pairs ∶ .  Since is functional, one can view as a class function (that is, a functional class), and thus, is a set whenever is a set. Let | denote the function obtained by restricting the domain of to the set . The replacement axiom implies that | is a set whenever is a set. Transfinite Recursion Theorem: Let be a functional formula. Then there is a class function such that, for all ordinals , |.  The transfinite recursion theorem is used to define what is commonly known as the cumulative hierarchy of sets and usually denoted by , which satisfies (see figure below) , , for any ordinal , : < , for any limit ordinal .   One obtains by repeatedly applying the power set operation at successor ordinals and by taking the union of all the previous sets at limit ordinals. In particular, and : <   The regularity axiom implies that for every set , there exists an ordinal such that . For this reason, the proper class is called the universe of sets. It follows that each set is in and that all of the axioms in ZF are true in . In addition, as one ascends the “ordinal spine,” one obtains sets of ever greater complexity that become better and better approximations to (see above figure). This is confirmed by the reflection principle (see below) which, in essence, asserts that any statement that is true in , is also true in some set . Let be a formula in the language of set theory with free variables . For any ordinal and , we write   to mean that is true in . The following theorem of ZF, due to Azriel Levy (Levy 1960) and Richard Montague (Montague 1961), implies that any specific truth that holds in likewise holds in some initial segment of ; in fact, it holds in unboundedly many initial segments. Reflection Principle: Let be a formula and let be an ordinal. Then there is an ordinal > such that, for all is true in if and only if .

As a corollary, for any finite number of formulas that hold in , the reflection principle implies that all of these formulas also hold in some . As noted before, there are an infinite number of axioms in ZF. Montague (Montague 1961) used the reflection principle to conclude that if ZF is consistent, then ZF is not finitely axiomatizable. Hence, ZF is not equivalent to any finite number of the axioms in ZF. This follows from Gödel’s second incompleteness theorem (see Kunen 2011, page 8), which implies that, if ZF is consistent, then one cannot prove, in ZF, the existence of a set model of ZF, that is, a set such that , for every axiom in ZF.

7. Gödel’s Constructible Universe

As we have seen, the cumulative hierarchy of sets is constructed in stages. At successor stages, one adds all possible subsets of the previous stage and, at limit stages, one takes the union of all of the previously produced sets. To prove that the axiom of choice and the Continuum Hypothesis are consistent with ZF, Kurt Gödel (1938) constructed the “inner model” of commonly known as the universe of constructible sets. As we will see, is a subclass of . The idea behind Gödel’s construction of is to modify the cumulative hierarchy structure so that the end result will produce a (smaller) class that satisfies ZF. For any set , define to

is definable over

where is definable over means that there are in and a formula such that, for all in ,

if and only if .

One can show, in ZF, that is a class function (Moschovakis 2009, 8D). Using the transfinite recursion theorem and the “definable subset” operation , Gödel defined the class by applying the operation at successor ordinals and by taking the union of all of the previous sets at limit ordinals. The class satisfies the following (see figure below):

,
, for any ordinal ,